Cézar A.
Data Engineer

Skills

Spark

SQL

Apache Hadoop

Java

Python

Amazon AWS

Cézar is available for hire

Bio

A dedicated student and passionate data science enthusiast seeking a first opportunity for professional growth. The primary goal is to build the skills needed to understand business strategies and requirements, and to manage, design, and develop complete analytical solutions in the field of Data Science.

Experience

Data Engineer
8/1/2022 - Present

Developed extensive proficiency in AWS, leveraging services such as S3, Redshift, and EMR for optimized data storage, processing, and analytics. Gained considerable expertise in using Apache Spark for large-scale data processing and analytics tasks, ensuring efficient data transformation and integration processes. Showcased advanced skills in Python, deploying it for scripting, automation, and the development of data pipelines. Utilized Control-M for scheduling and monitoring batch jobs, ensuring streamlined workflows and timely data processing. Demonstrated comprehensive knowledge of SQL for querying, manipulating, and managing relational databases, which facilitated the extraction of insightful and actionable data reports. Successfully applied these technical skills to design and implement scalable data architecture solutions, contributing to improved data quality and operational efficiency.

Data Engineer
5/1/2021 - 8/1/2022

Acquired advanced technical expertise in managing and optimizing Hadoop CDP clusters for large-scale data processing. Engineered robust data pipelines with Apache NiFi, Apache Hive, and Apache Spark for data ingestion, transformation, and analysis. Leveraged HDFS for storage and YARN for resource management. Employed Sqoop for data migration and Kafka for real-time data streaming. Orchestrated workflows with Apache Oozie and implemented security policies with Apache Ranger. Used SQL to query large datasets and developed automated reporting solutions. Gained experience in version control and collaborative development practices using Git.

Big Data Engineering
10/1/2020 - 4/1/2021

Oversaw the design, deployment, and maintenance of a high-performance Hadoop cluster infrastructure. Developed advanced skills in Hadoop ecosystem tools including Hive, Pig, and HBase. Successfully implemented solutions to optimize data storage and processing efficiency, utilizing MapReduce for parallel data processing. Deployed and managed clusters using Ambari and Cloudera Manager, ensuring high availability and fault tolerance.

Gained extensive experience in scripting and automation using Python and Bash to streamline cluster management tasks. Employed monitoring and logging systems such as Nagios and Splunk to ensure system reliability and performance. Leveraged AWS and Azure to scale infrastructure efficiently, implementing best practices for data security and compliance.

Collaborated closely with data engineers and analysts to develop data pipelines and workflows using Apache NiFi. Applied machine learning models on large datasets, enhancing data analysis capabilities. Demonstrated proficiency in optimizing query performance and troubleshooting complex data processing issues. Conducted regular performance tuning and cluster upgrades to maintain state-of-the-art infrastructure.

Data Engineer
4/1/2020 - 10/1/2020

Developed expertise in AWS services, including S3, EMR, Glue, and Lake Formation, to design and implement data lake solutions. Engineered data ingestion pipelines utilizing AWS Lambda, Kinesis, and Data Pipeline to facilitate seamless data processing. Demonstrated strong skills in ETL processes with AWS Glue, optimizing data transformation and cleaning tasks. Built and maintained data cataloging and searching capabilities using AWS Glue Data Catalog, enabling efficient metadata management and data discovery. Implemented robust security measures with IAM, KMS, and CloudTrail to ensure data governance and compliance. Utilized Redshift and Athena to perform complex queries and data analysis, showcasing proficiency in SQL and data warehousing concepts. Automated cloud infrastructure deployments using CloudFormation and Terraform, ensuring consistent and repeatable setups. Developed and maintained monitoring dashboards with CloudWatch and implemented alerting mechanisms to ensure system reliability and performance. Successfully orchestrated batch and real-time data processing workflows, contributing to data-driven decision-making processes.

Education

Dentistry at Federal University of Bahia
1988 - 1992
Big Data (data science) at Unyleya College
2020 - 2021
Big Data Engineer at Semantix Academy
2021 - 2021
Data Science at Institute of Management and Information Technology
2020 - 2020
Data Science at Digital House
2021 - 2021

Certifications

Oracle Next Education - Labora: Entrepreneurship at Grupo Alura
11/2/2020
Oracle Next Education - Labora: Java at Grupo Alura
11/2/2020
Certificate of Authority: Algorithm at DevMedia
10/2/2020
Oracle Next Education - Labora: Front-End at Grupo Alura
10/2/2020
Certificate of Authority: Python at DevMedia
10/2/2020
Certificate of Authority: Database at DevMedia
10/2/2020
Introduction to Data Science: Basic Concepts at LinkedIn
9/2/2020
SQL Course at DevMedia
9/2/2020
Introduction to Data Science: How to Tell Stories with Data at LinkedIn
9/2/2020
Introduction to the Brazilian Personal Data Protection Law at Escola Nacional de Administração Pública ENAP
9/2/2020
Programming Logic at Grupo Alura
9/2/2020
Data Analysis in R Language at Escola Nacional de Administração Pública ENAP
8/2/2020
Relational Database Modeling at DevMedia
8/2/2020