Gabriel M.
Data Engineer

Skills

Airflow

Kubernetes

Tableau

Spark

SQL

Databricks

Python

Microsoft Azure

Amazon AWS

Docker Cloud

Gabriel is available for hire

All Howdy candidates are vetted for skills and English proficiency.

Bio

A Data Engineer with four years of experience in Engineering and Data Analysis, specializing in converting data into actionable insights across various business domains such as Sales, Finance, Marketing, Customer Success, Governance, and Product. Expertise spans the entire data lifecycle, including extraction, preparation, analysis, and providing recommendations to inform business decisions.

Proficient in designing data pipelines with Python (PySpark) and SQL on Databricks for data ingestion and transformation, and in implementing Big Data solutions on Delta Lake, Data Lake, and Data Warehouse architectures on the Azure and AWS cloud platforms. Experienced in developing ETL pipeline routines using Data Factory, custom Python scripts, AWS Lambda, and Airflow, and in integrating data from multiple sources such as APIs, databases (PostgreSQL, MySQL, Oracle, SQL Server, MariaDB, Cassandra), and storage solutions (Azure Data Lake Storage Gen2, S3). Also proficient in handling file formats such as Parquet, Delta, CSV, Excel, and JSON.

Demonstrated capability in data analysis, writing queries in PySpark and SQL to generate reports and dashboards with tools such as Power BI, Tableau, Metabase, and Streamlit. Well-versed in data contracts and script versioning within DevOps CI/CD pipelines, and experienced in agile Scrum methodologies within collaborative squad environments.

Other technical proficiencies include Git and Docker, with foundational knowledge of Kubernetes (K8s), Terraform, and Jenkins.

Data Engineer
4/1/2023 - Present

Developed proficiency in data ingestion and processing using PySpark and SQL, interfacing with a variety of data sources including APIs, relational databases, and file formats such as XLSX, CSV, and JSON. Built data pipelines with Databricks workflows, Azure Data Factory, and AWS Lambda. Applied best practices and architectural principles for cloud solutions, specifically the Medallion architecture. Managed versioning, documentation, and data contract creation within the Azure DevOps environment. Followed Scrum rituals throughout sprints to improve team efficiency. Maintained strong communication with stakeholders to ensure valuable and timely deliveries.
Junior Data Analyst
2/1/2022 - 3/1/2023

Developed proficiency in data ingestion and processing using PySpark and SQL, with experience handling data from APIs and from the Lakehouse ecosystem's raw, silver, and gold tables within the Databricks environment on AWS. Built robust data pipelines in Databricks using customized notebooks, YAML configuration files, and Git, with Airflow handling orchestration in the background. Developed ETL flows within Tableau, drawing on data sources from both Databricks and Snowflake. Created dashboards in Tableau to improve data visualization. Conducted detailed data analysis to extract actionable insights for business decision-making. Followed Scrum practices during sprints, using Jira for task management. Maintained strong communication with stakeholders and clients to ensure value-driven deliveries.
Junior Data Analyst
3/1/2021 - 2/1/2022

Participated in the formation of the Data Squad, emphasizing good DevOps practices such as Git and GitLab within an Agile framework. Developed ETLs using Python and SQL scripts, extracting data from various sources including APIs, relational databases, and spreadsheets to build a pseudo data warehouse. Automated data flows on AWS EC2 using Jenkins. Created a centralized dashboard in Data Studio that consolidated data from the finance, CRM, and customer success departments. Developed a churn analysis dashboard that used product and financial data to identify key metrics for machine learning models (linear and tree-based regression, and K-Means clustering), with visualization through Streamlit.
Data Science Intern
5/1/2020 - 3/1/2021

Developed SQL queries in PostgreSQL on AWS using Metabase for data extraction and report automation. Monitored product metrics including feature access and bugs, creating dashboards on Metabase to track KPIs. Structured and created dashboards for the customer support team using Zendesk’s SQL capabilities. Conducted churn analysis through machine learning using Python, enhancing predictive analytics and retention strategies.