A Data Engineer with four years of experience in Engineering and Data Analysis, specializing in converting data into actionable insights across various business domains such as Sales, Finance, Marketing, Customer Success, Governance, and Product. Expertise spans the entire data lifecycle, including extraction, preparation, analysis, and providing recommendations to inform business decisions.
Proficient in designing data pipelines using Python (PySpark) and SQL on Databricks for data ingestion and transformation, and in implementing Big Data solutions on Delta Lake, Data Lake, and Data Warehouse architectures on the Azure and AWS cloud platforms. Experienced in developing ETL routines using Azure Data Factory, custom Python scripts, AWS Lambda, and Airflow, and adept at integrating data from multiple sources, including APIs, relational and NoSQL databases (PostgreSQL, MySQL, Oracle, SQL Server, MariaDB, Cassandra), and storage solutions (Azure Data Lake Storage Gen2, S3). Additionally, proficient in handling various file formats such as Parquet, Delta, CSV, Excel, and JSON.
Demonstrated capability in data analysis, crafting queries in PySpark and SQL to generate reports and dashboards with tools such as Power BI, Tableau, Metabase, and Streamlit. Well-versed in data contracts and script versioning within DevOps CI/CD pipelines, and experienced in agile Scrum methodologies in collaborative squad environments.
Other technical proficiencies include Git and Docker, with foundational knowledge of Kubernetes (K8s), Terraform, and Jenkins.