
Vítor M.
Data Scientist

PL/SQL
AI
GitHub
Java
Python
Elasticsearch
Amazon AWS
Google Cloud
Bio

An experienced professional in information processing technologies, specializing in Natural Language Processing (NLP), Machine Learning (ML), and Information Retrieval (IR). With over 8 years in the field, has worked as a data scientist on R&D and AI-focused projects, and as a lecturer and PhD candidate in Business Intelligence and Information Retrieval with an emphasis on NLP. Proficient in using Deep Learning Language Models (LM/LLM), developing indexing techniques, and implementing crawling and search methods for information systems, with additional expertise in distributed data processing and analysis.

Career involvement has encompassed large-scale projects handling massive datasets and developing machine learning algorithms for extracting meaningful insights. Experiences include implementing supervised and unsupervised machine learning models for a variety of applications such as text classification, summarization, and the fine-tuning of large language models for specialized problem-solving.

Possesses extensive knowledge in distributed system architectures and parallel processing, with a particular focus on the Apache Spark ecosystem. Proficiency includes working within environments that demand high efficiency and scalability, contributing to the optimization of data processing pipelines (ETL) and the deployment of resilient big data solutions.

  • Data Scientist
    6/1/2013 - 1/1/2014

    Developed proficiency in data analysis by building radio streams and remodeling the database within the Radialize system. Utilized advanced data manipulation and database restructuring techniques to enhance the efficiency of data processing and storage. Employed tools and frameworks such as SQL, Python, and various ETL (Extract, Transform, Load) processes to handle and transform large datasets. Gained experience in optimizing database performance and ensuring data integrity. Demonstrated ability to work with streaming data sources and implement robust data modeling solutions.

  • Data Scientist
    8/1/2023 - 1/1/2024

    Led the training, deployment, and evaluation of language models and large language models within the Natural Language Processing (NLP) field, focusing on the Human Resources domain. Developed expertise in cutting-edge NLP frameworks and tools, optimizing models for specific HR applications. Utilized machine learning platforms to streamline model deployment and ensure high performance and accuracy. Conducted thorough evaluations to measure the effectiveness of models in real-world HR scenarios, driving continuous improvement and innovation in NLP methodologies.

  • Substitute Professor
    2/1/2023 - 7/1/2023

    Assumed the role of Substitute Professor on a temporary contract for courses including Information Retrieval on the Web and Social Networks, Business Intelligence, and Analysis and Development of Systems. Gained substantial experience delivering educational content on the intricacies of web and social network information retrieval. Cultivated skills in teaching methodologies for Business Intelligence concepts, encompassing data analysis and decision-making frameworks. Facilitated multiple classes on the principles and practices of systems analysis and development, emphasizing system design, development life cycle, and project management.

  • Data Scientist
    5/1/2020 - 2/1/2023

    Developed expertise in integrating databases and implementing data deduplication strategies, while defining the data architecture for the MPMG project. Applied active learning algorithms for pairing and aimed to generalize data integration solutions with an emphasis on the construction of Master Data Management (MDM) systems. Supervised and monitored the progress of activities performed by other scholarship holders.

  • Assistant Researcher
    7/1/2016 - 7/1/2018

    Developed expertise in textual analysis and linguistic algorithms, including techniques such as TF-IDF, LDA, feature engineering, and word embeddings, while working on the development of YAKE (Yet Another Keyword Extractor), one of the leading unsupervised keyword extraction tools in the field. Engaged in the creation and implementation of unsupervised keyword extraction algorithms. Gained hands-on experience with automated systems for executing and evaluating YAKE against numerous other algorithms across various multilingual datasets. This role deepened knowledge of natural language processing (NLP) and reinforced a dedication to excellence in NLP research.

  • Professor
    6/1/2013 - 1/1/2014

    Taught courses in Statistics and Algorithms and Data Structures II as part of a national technical education program. Developed proficiency in creating comprehensive curriculum plans and instructional materials. Leveraged statistical software and programming tools to facilitate student learning, ensuring a thorough understanding of statistical methods and data structures. Applied pedagogical techniques to explain complex algorithms, data sorting, and management concepts. Enhanced technical expertise by incorporating real-world problem-solving scenarios and practical applications in the teaching process. Utilized classroom management systems to monitor student progress and provide continuous feedback. Demonstrated strong communication skills through lectures, workshops, and individual consultations, fostering an engaging and supportive learning environment.

  • Computer Science at Federal University of Minas Gerais
    2018 - 2025

  • Computer Science at Federal University of Ouro Preto
    2009 - 2014

  • Neural Networks and Deep Learning at deeplearning.ai on Coursera
    3/1/2018

  • Text Mining at Advanced Statistics and Data Mining Summer School
    5/31/2017

  • Big Data with Apache Spark at Advanced Statistics and Data Mining Summer School
    5/31/2017

Vítor is available for hire

All Howdy candidates are vetted for skills and English proficiency.