Mateus Restier MateusRestier

Hey, I'm Mateus 👋

Data Scientist | AI & Cloud Data Engineering | Automation Enthusiast
I specialize in Generative AI (RAG), Machine Learning, and Cloud Architectures (AWS/Snowflake), turning complex data into strategic insights 💡

🧠 About Me

I'm a Data Scientist based in Rio de Janeiro, Brazil 🇧🇷.
Currently, I work at FGV IBRE, focusing on Generative AI, Machine Learning, and Data Engineering within the AWS ecosystem. My work involves building advanced NLP pipelines, developing classification models, and optimizing cloud data processes using Snowflake and Streamlit.

I hold a B.Sc. in Computer Science and have a strong background in business process automation and BI from previous roles at Bagaggio and Enel.

Fun fact: Before diving into data, I was a professional e-sports player, a journey that sharpened my resilience, strategic thinking, and ability to perform under high pressure 🎮.

🛠️ Tech Stack

Languages & Frameworks

Cloud & Data Engineering

🚀 Featured Projects

🔹 Insight-Invest 🔓

End-to-end automated stock analysis, forecasting, and recommendation system using web scraping, RandomForest models, PostgreSQL, and an interactive Dash/Plotly dashboard.

🔹 automated-economic-releases 🔒

Automated bulletin generation for economic indicators (INCC-M, IGP-M, IGP-DI, ICOMEX): Excel ingestion, LLM-driven narrative writing, and Word (.docx) rendering via docxtpl — with a Streamlit UI for operator input.

🔹 document-vector-pipeline 🔒

ETL pipeline for document vectorization: ingests raw files, extracts and chunks text, generates embeddings, and stores them in a vector database for downstream semantic search and retrieval.
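
The repository is private, so here is only a minimal sketch of the chunking step, assuming fixed-size character windows with overlap. The function name `chunk_text` and its parameters are hypothetical and not taken from the project.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of `chunk_size` that overlap by `overlap`.

    Hypothetical helper for illustration; the real pipeline may chunk by
    tokens, sentences, or document structure instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk made only of overlap is produced.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlapping windows keep a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.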

🔹 rag-framework 🔒

Modular Retrieval-Augmented Generation (RAG) framework integrating vector search with LLM inference to answer queries grounded in private document collections.
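
As a rough illustration of the retrieve-then-prompt idea, the sketch below ranks passages by cosine similarity over toy hand-written vectors and assembles a grounded prompt. The names `cosine_similarity`, `retrieve`, and `build_prompt` are hypothetical; the actual framework presumably uses a real embedding model, a vector database, and an LLM call, none of which are shown here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: list[float],
             index: list[tuple[str, list[float]]],
             top_k: int = 2) -> list[str]:
    """Return the `top_k` passages whose vectors are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the passages only."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy index: (passage, embedding) pairs with made-up 2-D vectors.
toy_index = [
    ("IGP-M is a price index.", [1.0, 0.0]),
    ("Librosa extracts audio features.", [0.0, 1.0]),
    ("INCC-M tracks construction costs.", [0.9, 0.1]),
]
passages = retrieve([1.0, 0.0], toy_index, top_k=2)
prompt = build_prompt("What is IGP-M?", passages)
```

The string returned by `build_prompt` would then be sent to whichever LLM the framework is configured with.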

🔹 sound-dna 🔓

End-to-end pipeline for music genre classification: YouTube ingestion (yt-dlp), 369 audio features via DSP (librosa), ML models (XGBoost/Random Forest), and a Streamlit app with interactive spectral analysis + genre prediction.

🔹 football-dataops-lakehouse 🔓

Local football data lakehouse simulating AWS architecture (S3 → Athena → MWAA) using MinIO, DuckDB, Dagster, and Great Expectations — ingesting StatsBomb open data through a medallion pipeline (raw → validated → trusted).

🔹 competitor-analysis 🔒

Competitor analysis automation pipeline: web scraping, data consolidation, and structured reporting to support strategic pricing and market positioning decisions.

🔹 joybind 🔓

JoyBind maps controller buttons to custom keystrokes and absolute screen coordinates. Built with Python to simplify and automate macro interactions in games.

🔹 venv-creation 🔓

A simple .bat file that creates a virtual environment (venv), installs all dependencies listed in requirements.txt, and activates the environment.

🔹 auto-keyboard-typing 🔓

A simple program that types any text you give it, useful on sites where pasting with Ctrl+V is blocked.

🎯 Currently Focused On

🤖 Generative AI & RAG: Building systems that combine LLMs with external data sources for specialized context.
🧠 Machine Learning: Applying algorithms and models for classification, prediction, and pattern identification in complex datasets.
☁️ Cloud Architecture: Optimizing data pipelines and efficiency on AWS and Snowflake.
📊 Scalable ETL: Ensuring high-performance data processing for large-scale indicators.

🤝 Let's Connect!

Whether it's about AI, Cloud Engineering, Automation, or Retro Gaming, I'm always happy to chat!
📬 LinkedIn | ✉️ restier2001@gmail.com

“I believe in data with purpose — not just numbers, but stories that move people and businesses.”
