MateusRestier/README.md

Hey, I'm Mateus 👋

Data Scientist | AI & Cloud Data Engineering | Automation Enthusiast
I specialize in Generative AI (RAG), Machine Learning, and Cloud Architectures (AWS/Snowflake), turning complex data into strategic insights 💡


🧠 About Me

I'm a Data Scientist based in Rio de Janeiro, Brazil 🇧🇷.
Currently, I work at FGV IBRE, focusing on Generative AI, Machine Learning, and Data Engineering within the AWS ecosystem. My work involves building advanced NLP pipelines, developing classification models, and optimizing cloud data processes using Snowflake and Streamlit.

I hold a B.Sc. in Computer Science and have a strong background in automating business processes and BI from my previous experiences at Bagaggio and Enel.

Fun fact: Before diving into data, I was a professional e-sports player, a journey that sharpened my resilience, strategic thinking, and ability to perform under high pressure 🎮.


🛠️ Tech Stack

Languages & Frameworks

Cloud & Data Engineering


🚀 Featured Projects

🔹 Insight-Invest 🔓

End-to-end automated stock analysis, forecasting, and recommendation system using web scraping, RandomForest models, PostgreSQL, and an interactive Dash/Plotly dashboard.

🔹 automated-economic-releases 🔒

Automated bulletin generation for economic indicators (INCC-M, IGP-M, IGP-DI, ICOMEX): Excel ingestion, LLM-driven narrative writing, and Word (.docx) rendering via docxtpl, with a Streamlit UI for operator input.

🔹 document-vector-pipeline 🔒

ETL pipeline for document vectorization: ingests raw files, extracts and chunks text, generates embeddings, and stores them in a vector database for downstream semantic search and retrieval.
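
The chunking step in a pipeline like this can be sketched as below. This is a minimal illustration, not the repository's code: the function name `chunk_text`, the character-based windows, and the default sizes are all assumptions made for the example.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the real pipeline may chunk by tokens,
    sentences, or document structure instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is a pure repeat of the overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap keeps a sentence that straddles a boundary partly visible in both neighbouring chunks, which tends to help retrieval at the cost of storing some text twice.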

🔹 rag-framework 🔒

Modular Retrieval-Augmented Generation (RAG) framework integrating vector search with LLM inference to answer queries grounded in private document collections.
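
As a rough sketch of the retrieve-then-generate idea, the snippet below ranks documents by cosine similarity and assembles a grounded prompt. Everything here is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and the LLM call itself is left out.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the context only."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    docs = [
        "the igp-m index rose in march",
        "football match results",
        "streamlit builds dashboards",
    ]
    question = "igp-m index"
    print(build_prompt(question, retrieve(question, docs, k=1)))
```

Keeping retrieval and prompt assembly as separate functions mirrors the modular layout the description suggests: either piece can be swapped without touching the other.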

🔹 sound-dna 🔓

End-to-end pipeline for music genre classification: YouTube ingestion (yt-dlp), 369 audio features via DSP (librosa), ML models (XGBoost/Random Forest), and a Streamlit app with interactive spectral analysis + genre prediction.

🔹 football-dataops-lakehouse 🔓

Local football data lakehouse simulating AWS architecture (S3 → Athena → MWAA) using MinIO, DuckDB, Dagster, and Great Expectations, ingesting StatsBomb open data through a medallion pipeline (raw → validated → trusted).

🔹 competitor-analysis 🔒

Competitor analysis automation pipeline: web scraping, data consolidation, and structured reporting to support strategic pricing and market positioning decisions.

🔹 joybind 🔓

JoyBind maps controller buttons to custom keyboard strokes and absolute screen coordinates. Built with Python to simplify and automate macro interactions in games.
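
One way to picture the mapping is a lookup table from button names to actions. The sketch below is hypothetical and not JoyBind's actual code: the `KeyStroke` and `Click` types, the button names, and the coordinates are invented for illustration, and reading the controller or sending real input events is left out.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class KeyStroke:
    """Press a keyboard key (hypothetical action type)."""
    key: str


@dataclass(frozen=True)
class Click:
    """Click at an absolute screen position (hypothetical action type)."""
    x: int
    y: int


Action = Union[KeyStroke, Click]

# Example bindings; button names and targets are made up for the sketch.
BINDINGS: dict[str, Action] = {
    "A": KeyStroke("space"),
    "B": KeyStroke("e"),
    "X": Click(960, 540),
}


def resolve(button: str, bindings: dict[str, Action]) -> Optional[Action]:
    """Look up the action bound to a controller button, if any."""
    return bindings.get(button)


def describe(action: Optional[Action]) -> str:
    """Human-readable summary of what an action would do."""
    if isinstance(action, KeyStroke):
        return f"press key '{action.key}'"
    if isinstance(action, Click):
        return f"click at ({action.x}, {action.y})"
    return "no binding"


if __name__ == "__main__":
    for button in ("A", "X", "Y"):
        print(button, "->", describe(resolve(button, BINDINGS)))
```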

🔹 venv-creation 🔓

A simple .bat file that creates a virtual environment (venv), installs all dependencies listed in requirements.txt, and activates the environment.

🔹 auto-keyboard-typing 🔓

A simple program that types whatever text you want, useful on sites where Ctrl+V (paste) is blocked.


🎯 Currently Focused On

  • 🤖 Generative AI & RAG: Building systems that combine LLMs with external data sources for specialized context.
  • 🧠 Machine Learning: Applying algorithms and models for classification, prediction, and pattern identification in complex datasets.
  • ☁️ Cloud Architecture: Optimizing data pipelines and efficiency on AWS and Snowflake.
  • 📊 Scalable ETL: Ensuring high-performance data processing for large-scale indicators.

🤝 Let's Connect!

Whether it's about AI, Cloud Engineering, Automation, or Retro Gaming, I'm always happy to chat!
📬 LinkedIn | ✉️ restier2001@gmail.com


“I believe in data with purpose: not just numbers, but stories that move people and businesses.”
