// DATA SCIENTIST · NAPOLI, ITALY

Fauzan
Ejaz

Turning raw signals into forecasts, clusters and decisions: machine learning & MLOps for the real world.

+15%

RMSE / R² lift via cluster-based forecasting

88%

F1-score: BERT sentiment on 360K+ reviews

78.9%

mAP: YOLOv8 visual quality control

360K+

Amazon reviews processed with NLP

[01]

About

Data Scientist with hands-on experience in data analysis, statistical modeling, and machine learning, having moved into the role from a Data Analyst position. Skilled in Python, SQL, and predictive analytics, with a strong foundation in automation and MLOps practices. Passionate about turning complex data into actionable insights that improve decision-making, optimize performance, and drive business growth within collaborative, innovation-focused teams.

📍 Napoli, Italy · +39 342 1014345 · +91 9832887252

lang_proficiency.plot( )

Italian: B1
English: C1
Arabic: A1
Hindi: C1
Urdu: C1
Bengali: B1
[02]

Skills

stack.stream( )

Python · R · SQL · Scala · Apache Spark · ETL Pipelines · Data Warehousing · AWS (S3, EC2, Lambda, Redshift) · Machine Learning · Generative AI
Big Data Processing · Pandas · NumPy · Matplotlib · Power BI · Model Optimization · CI/CD · Git · Agile & Scrum

soft_skills[ ]

Analytical Thinking · Problem Solving · Ownership & Accountability · Collaboration & Communication · Time Management · Adaptability · Self-Motivated & Goal-Oriented
[03]

Experience

JUL 2025 – DEC 2025 · NAPLES, ITALY

Data Scientist

@ GEKO S.p.A — Energy & Ambient

  • Developed a production-ready time series forecasting pipeline using Python, Pandas, Scikit-learn, and XGBoost, predicting hourly electricity consumption across Italian zones with improved accuracy through Optuna hyperparameter tuning and temporal cross-validation.
  • Implemented unsupervised clustering (K-Means, DTW, HAC) on zone-wise consumption and weather patterns to identify region-specific behaviours, enabling cluster-based model training that improved RMSE and R² performance by over 15% compared to baseline models.
  • Engineered advanced features including lag variables, rolling statistics, Fourier seasonality terms, and weather-based indicators (temperature, humidity, altitude), and automated the training, evaluation, and saving of a model for each cluster, preparing the pipeline for production deployment and monitoring.
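
The feature ideas listed above (lag variables, rolling statistics, Fourier seasonality terms) and the temporal cross-validation can be sketched roughly as follows. This is a minimal illustration and not the GEKO pipeline: it uses only the Python standard library rather than Pandas or XGBoost, the data is synthetic, and the names `build_features` and `expanding_window_splits` as well as the lag, window, and period values are assumptions chosen for the example.

```python
import math
from statistics import mean


def build_features(load, lags=(1, 24), window=3, period=24, harmonics=1):
    """Return one feature dict per hour that has enough history."""
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(load)):
        row = {"target": load[t]}
        # Lag variables: consumption observed `lag` hours earlier.
        for lag in lags:
            row[f"lag_{lag}"] = load[t - lag]
        # Rolling mean over the previous `window` hours. Hour t itself is
        # excluded so the feature never leaks the value being predicted.
        row[f"roll_mean_{window}"] = mean(load[t - window:t])
        # Fourier seasonality terms for the daily cycle.
        for k in range(1, harmonics + 1):
            angle = 2 * math.pi * k * t / period
            row[f"sin_{k}"] = math.sin(angle)
            row[f"cos_{k}"] = math.cos(angle)
        rows.append(row)
    return rows


def expanding_window_splits(n_rows, n_splits=3, test_size=24):
    """Yield (train, test) index ranges where training always precedes testing."""
    for i in range(n_splits):
        test_end = n_rows - (n_splits - 1 - i) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            continue  # not enough history for this fold
        yield range(0, test_start), range(test_start, test_end)


# Synthetic hourly series, purely illustrative.
load = [100 + 10 * math.sin(2 * math.pi * h / 24) for h in range(24 * 5)]
rows = build_features(load)
splits = list(expanding_window_splits(len(rows)))
```

Each fold trains only on rows that come before its test block, which is the property that distinguishes temporal cross-validation from a shuffled k-fold split.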

JAN 2022 – OCT 2023 · KOLKATA, INDIA

Data Analyst

@ LTIMindtree

  • Used SQL for database querying and data extraction, Python for data wrangling and analysis, and Excel for data manipulation and visualization.
  • Developed and maintained automated reporting dashboards using Tableau to track key performance indicators and metrics.
[04]

Projects

01

proj_01

End-to-End Visual Quality Control System (MLOps)

Designed an end-to-end visual quality control system for manufacturing, leveraging YOLOv8 and PyTorch to achieve 78.9% mAP on the MVTec AD dataset. Engineered an automated data pipeline handling versioning and segmentation-mask-to-bounding-box conversion for reproducibility. Optimized the model for edge deployment via ONNX export, reducing inference latency on CPU, with a Streamlit dashboard and MLflow for real-time visualization and experiment tracking.
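
The segmentation-mask-to-bounding-box step mentioned above can be illustrated with a small sketch. It assumes a binary mask given as a list of rows and produces a single YOLO-style box (centre and size normalised to the image dimensions); the function name `mask_to_yolo_bbox` is hypothetical and the project's actual conversion code is not shown in this page.

```python
def mask_to_yolo_bbox(mask):
    """Convert a binary mask (list of rows of 0/1) to a YOLO-style box.

    Returns (x_center, y_center, width, height), each normalised to [0, 1],
    or None when the mask has no foreground pixels.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    # Rows and columns that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(width) if any(row[x] for row in mask)]
    if not xs or not ys:
        return None
    x_min, x_max = xs[0], xs[-1] + 1  # +1 because pixel indices are inclusive
    y_min, y_max = ys[0], ys[-1] + 1
    return (
        (x_min + x_max) / 2 / width,
        (y_min + y_max) / 2 / height,
        (x_max - x_min) / width,
        (y_max - y_min) / height,
    )


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
box = mask_to_yolo_bbox(mask)
```

This sketch wraps all foreground pixels in one box; a mask with several separate defect regions would need a connected-component pass first so that each region gets its own box.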

YOLOv8 · PyTorch · ONNX · MLflow · Streamlit
02

proj_02

GDPR-Guardian

A GDPR compliance & privacy-by-design service for the Italian/European market that automatically detects and redacts sensitive data such as the Codice Fiscale using custom Presidio recognizers and spaCy's Italian model, ensuring data safety before it reaches the database. Demonstrates full end-to-end containerization with Docker, an API gateway for data governance, and full-stack data science from custom logic to deployment.
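
As a rough illustration of the detection-and-redaction idea, the sketch below matches the standard 16-character Codice Fiscale layout with a plain regular expression. It is a stand-in that uses only the standard library, not the project's Presidio recognizers or spaCy model; the `redact` helper and the placeholder text are assumptions, and the pattern ignores the omocodia variants in which digits are replaced by letters.

```python
import re

# Standard layout: 6 letters (surname + name), 2 digits (year), 1 month
# letter, 2 digits (day, +40 for women), 1 letter + 3 digits (municipality
# code), 1 check letter.
CODICE_FISCALE = re.compile(
    r"\b[A-Z]{6}\d{2}[ABCDEHLMPRST]\d{2}[A-Z]\d{3}[A-Z]\b",
    re.IGNORECASE,
)


def redact(text, placeholder="<CODICE_FISCALE>"):
    """Replace every Codice Fiscale-shaped token with a placeholder."""
    return CODICE_FISCALE.sub(placeholder, text)


# The sample code below is a widely used fictitious example value.
sample = "Cliente: Mario Rossi, CF RSSMRA85T10A562S, ordine 1042."
clean = redact(sample)
```

A shape-only match like this does not verify the check letter, so a production recognizer would likely add checksum validation to cut false positives.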

Presidio · spaCy · Docker · Data Governance
03

proj_03

Sentiment Analysis on Amazon Food Reviews (BERT/RoBERTa)

Preprocessed 360K+ Amazon food reviews using NLP techniques (tokenization, stop-word removal, TF-IDF). Fine-tuned a transformer-based BERT/RoBERTa model to classify sentiment into positive, neutral, and negative, achieving an 88% F1-score and outperforming VADER and baseline models. Handled imbalanced classes, implemented multilingual detection, and served the model with MLflow plus live monitoring via Weights & Biases in a Docker-ready deployment.
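
Because the classes are imbalanced, an F1-score is more informative than plain accuracy. The sketch below computes a macro-averaged F1 for the three sentiment labels. Macro averaging is an assumption here, since the page does not say which averaging the 88% figure uses, and the labels and predictions are toy values rather than project results.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


labels = ["positive", "neutral", "negative"]
# Toy data: the single neutral review is misclassified as positive.
y_true = ["positive", "positive", "positive", "positive", "neutral", "negative"]
y_pred = ["positive", "positive", "positive", "positive", "positive", "negative"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, labels)
```

In this toy case accuracy is 5/6, while the macro F1 is noticeably lower because the missed minority class contributes a per-class score of zero.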

BERT · RoBERTa · NLP · MLflow · Weights & Biases
04

proj_04

Drug Store Price Prediction

Analyzed sales data for Rossmann, a major German drugstore chain with a strong European presence, using an open-source Rossmann dataset to surface insights that enhance the company's decision-making and operational strategies.

Forecasting · EDA · Machine Learning
05

proj_05

Gesture Detection — Volume Control

Enables gesture-based control of computer volume using hand tracking. A hand-tracking module detects key landmarks to recognize gestures; moving the hand closer or farther adjusts the volume for a seamless, hands-free experience, built with a modular approach that simplifies development.
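
The core mapping from a tracked distance to a volume level can be sketched as below. This is only an illustration under stated assumptions: the distance is taken between two fingertip landmarks in normalised image coordinates, the `d_min` and `d_max` calibration values are hypothetical, and the hand-tracking and system-volume calls of the real project are left out.

```python
import math


def distance_to_volume(d, d_min=0.05, d_max=0.30):
    """Linearly map a landmark distance to a volume percentage in [0, 100]."""
    ratio = (d - d_min) / (d_max - d_min)
    # Clamp so distances outside the calibrated range do not overshoot.
    return round(100 * min(1.0, max(0.0, ratio)))


# Hypothetical landmark positions in normalised (x, y) image coordinates.
thumb_tip = (0.40, 0.50)
index_tip = (0.55, 0.70)
volume = distance_to_volume(math.dist(thumb_tip, index_tip))
```

Keeping the mapping in its own function, separate from the tracking loop, matches the modular approach described above and makes the calibration easy to test on its own.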

Computer Vision · Hand Tracking · Python
06

proj_06

AI-Powered Commerce Agent

An AI-powered commerce agent built with LangGraph and the OpenAI API to automate product recommendations, order lookups, and policy-based cancellations. Implemented strict policy enforcement, traceable decision logging, and modular testing for intelligent and secure e-commerce interactions.
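
Policy enforcement with traceable decision logging can be pictured with a small sketch like the one below. It does not use LangGraph or the OpenAI API, and the rule itself (unshipped orders may be cancelled within 24 hours) is a hypothetical policy invented for illustration, as are the `Order` and `CancellationPolicy` names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Order:
    order_id: str
    placed_at: datetime
    shipped: bool = False


@dataclass
class CancellationPolicy:
    # Hypothetical rule: unshipped orders can be cancelled within this window.
    window: timedelta = timedelta(hours=24)
    log: list = field(default_factory=list)

    def decide(self, order, now):
        if order.shipped:
            allowed, reason = False, "order already shipped"
        elif now - order.placed_at > self.window:
            allowed, reason = False, "cancellation window expired"
        else:
            allowed, reason = True, "within cancellation window"
        # Record every decision with its reason so it can be audited later.
        self.log.append(
            {"order_id": order.order_id, "allowed": allowed, "reason": reason}
        )
        return allowed


policy = CancellationPolicy()
now = datetime(2025, 1, 2, 12, 0)
fresh = Order("A1", placed_at=datetime(2025, 1, 2, 9, 0))
stale = Order("B2", placed_at=datetime(2025, 1, 1, 9, 0))
results = [policy.decide(fresh, now), policy.decide(stale, now)]
```

Keeping the rule in deterministic code, rather than leaving it to the language model, is one way to make enforcement strict: the agent can request a cancellation, but only the policy object grants it.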

LangGraph · OpenAI API · Agents
[05]

Education

2023 – 2026

Master's in Data Science

University of Napoli Federico II, Italy

2018 – 2021

Bachelor of Computer Application

Asansol Engineering College · Grade 9/10

[06] — model.predict(next_role)

Let's turn data
into decisions

+39 342 1014345 · +91 9832887252 · Napoli, Italy · GitHub