RM Full Stack & AI Engineer

Data Science Project Ideas

Build real data science projects spanning EDA, machine learning, NLP, deep learning, and deployment to develop end-to-end skills.

Exploratory Data Analysis Dashboard

beginner

Analyze a public dataset (e.g. Titanic or Iris) and build an interactive dashboard to communicate insights.

Requirements
pandas data wrangling, data visualization, EDA methodology, Streamlit/Dash, statistical summaries
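
The statistical summaries behind a dashboard often start as simple group-wise rates. Below is a minimal sketch in plain Python, assuming made-up toy rows (not real Titanic records) and hypothetical column names `pclass` and `survived`; in the project itself a pandas `groupby` would do the same job.

```python
from collections import defaultdict
from statistics import mean

# Toy rows for illustration only; not taken from the real Titanic dataset.
rows = [
    {"pclass": 1, "survived": 1},
    {"pclass": 1, "survived": 1},
    {"pclass": 2, "survived": 0},
    {"pclass": 2, "survived": 1},
    {"pclass": 3, "survived": 0},
    {"pclass": 3, "survived": 0},
]

def rate_by_group(records, group_key, value_key):
    """Return the mean of value_key for each distinct value of group_key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {key: mean(values) for key, values in sorted(groups.items())}

survival_by_class = rate_by_group(rows, "pclass", "survived")
print(survival_by_class)
```

A table like this is the kind of result a single dashboard chart would then visualize.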

House Price Regression Model

beginner

Predict house sale prices using the Ames Housing dataset with feature engineering and regression models.

Requirements
scikit-learn, feature engineering, regression modeling, model evaluation, data preprocessing
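
To make the regression modeling and model evaluation items concrete, here is a rough sketch of a one-feature least squares fit scored with RMSE on log prices, which treats relative errors on cheap and expensive houses alike. The numbers are toy values, not Ames data, and the helper names `fit_line` and `rmse_log` are hypothetical; a real project would use scikit-learn estimators and many more features.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def rmse_log(y_true, y_pred):
    """Root mean squared error between log prices (prices must be positive)."""
    errors = [(math.log(t) - math.log(p)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(errors) / len(errors))

# Toy values for illustration only: living area (sq ft) and sale price (USD).
area = [1000, 1500, 2000, 2500]
price = [100_000, 150_000, 200_000, 250_000]

slope, intercept = fit_line(area, price)
predictions = [slope * a + intercept for a in area]
print(slope, intercept, rmse_log(price, predictions))
```

Scoring on the training rows, as done here for brevity, overstates quality; the project should evaluate on held-out data.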

Customer Churn Classifier

beginner

Build a binary classification model to predict which telecom customers will churn using real-world structured data.

Requirements
classification, imbalanced data handling, hyperparameter tuning, SHAP explainability, scikit-learn pipelines
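
Churn data is usually imbalanced, so accuracy alone can mislead. The sketch below, with made-up labels and a hypothetical helper `binary_metrics`, compares a baseline that never predicts churn against a model that flags a few customers; it is only an illustration of why precision, recall, and F1 belong in the evaluation.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 where label 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels for illustration: 1 = churned, 0 = stayed (2 of 10 churn).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_stay = [0] * 10                          # baseline: never predicts churn
some_model = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]     # hypothetical model output

baseline = binary_metrics(y_true, always_stay)
model = binary_metrics(y_true, some_model)
print(baseline)
print(model)
```

The baseline has the higher accuracy yet finds no churners at all, which is why recall on the churn class is the more useful number here.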

Movie Recommendation Engine

intermediate

Build a hybrid recommendation system that combines collaborative and content-based filtering, using the MovieLens dataset.

Requirements
collaborative filtering, TF-IDF, recommendation systems, matrix factorization, model evaluation
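
As one possible starting point for the collaborative filtering half only, here is a small item-based sketch using cosine similarity between item rating vectors. The ratings, user names, and item ids are invented for illustration (not MovieLens data), and `recommend` is a hypothetical helper; the content-based and matrix factorization parts are not covered.

```python
import math

# Toy ratings for illustration only: user -> {item: rating}.
ratings = {
    "ann": {"m1": 5, "m2": 5},
    "bob": {"m1": 4, "m2": 5, "m3": 1},
    "cat": {"m1": 1, "m3": 5},
    "eve": {"m1": 5},
}

def item_vectors(user_ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    items = {}
    for user, rated in user_ratings.items():
        for item, rating in rated.items():
            items.setdefault(item, {})[user] = rating
    return items

def cosine(u, v):
    """Cosine similarity of two sparse vectors; missing entries count as 0."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def recommend(user, user_ratings, top_n=2):
    """Rank unseen items by the sum of similarity * rating over the user's rated items."""
    vectors = item_vectors(user_ratings)
    seen = user_ratings[user]
    scores = {}
    for candidate, vector in vectors.items():
        if candidate in seen:
            continue
        scores[candidate] = sum(
            cosine(vector, vectors[item]) * rating for item, rating in seen.items()
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("eve", ratings))
```

The unnormalized score is a deliberate simplification that only produces a ranking; predicting an actual rating would need a normalized weighted average or a factorization model.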

Twitter Sentiment Analysis Pipeline

intermediate

Build an end-to-end NLP pipeline to classify tweet sentiment and visualize trends over time.

Requirements
NLP preprocessing, TF-IDF, HuggingFace Transformers, BERT fine-tuning, pipeline design
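
Before reaching for a fine-tuned BERT model, a TF-IDF baseline is a reasonable first stage of the pipeline. The sketch below computes unsmoothed TF-IDF weights by hand, assuming made-up example tweets and a very simple hypothetical tokenizer; libraries such as scikit-learn use smoothed variants, so their numbers will differ.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf is the term count divided by document length;
    idf is log(number of documents / documents containing the term).
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Made-up example tweets for illustration only.
tweets = ["I love this phone", "I hate this phone", "love love love"]
weights = tf_idf(tweets)
print(weights[1])
```

In the second tweet, "hate" gets the largest weight because it appears in only one document, which is the signal a downstream sentiment classifier would pick up.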

Time Series Demand Forecasting

intermediate

Forecast retail product demand using classical and ML-based time series methods on the Rossmann store dataset.

Requirements
time series analysis, ARIMA, feature engineering, XGBoost, forecasting evaluation
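
Any ARIMA or XGBoost forecast is easier to judge against a simple baseline evaluated on a chronological hold-out. The following sketch, with toy daily sales (not Rossmann data) and an assumed weekly season of 7 days, shows a seasonal naive forecast scored by mean absolute error; the function names are hypothetical.

```python
def time_split(series, n_test):
    """Hold out the last n_test points (n_test > 0); time series are never shuffled."""
    return series[:-n_test], series[-n_test:]

def seasonal_naive(train, horizon, season=7):
    """Forecast each step by repeating the value from one season earlier."""
    history = list(train)
    forecast = []
    for _ in range(horizon):
        value = history[-season]
        forecast.append(value)
        history.append(value)  # lets horizons longer than one season keep repeating
    return forecast

def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy daily sales with a weekly pattern; illustrative values only.
sales = [10, 12, 11, 13, 15, 20, 18] * 3 + [11, 12, 12, 13, 16, 21, 18]
train, test = time_split(sales, n_test=7)
forecast = seasonal_naive(train, horizon=len(test))
print(forecast, mae(test, forecast))
```

A more elaborate model earns its place in the project only if it beats this baseline on the same hold-out window.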

Image Classification with CNNs

intermediate

Train a convolutional neural network to classify images from CIFAR-10, applying transfer learning and data augmentation.

Requirements
CNNs, transfer learning, data augmentation, PyTorch/TensorFlow, model training & evaluation
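
To build intuition for what a convolutional layer computes and what a flip augmentation does to its output, here is a tiny plain-Python sketch on a toy 4x4 grayscale image. The helper names are hypothetical, and the real project would rely on PyTorch or TensorFlow layers and image transforms.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products.

    As in PyTorch's convolution layers, this is cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def hflip(image):
    """Horizontal flip, a common data augmentation for natural images."""
    return [row[::-1] for row in image]

# Toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges

feature_map = conv2d_valid(image, edge_kernel)
flipped_map = conv2d_valid(hflip(image), edge_kernel)
print(feature_map)
print(flipped_map)
```

The filter responds with opposite sign on the flipped image, which hints at why augmentation exposes the network to variations a single fixed filter would not cover.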

End-to-End ML Model Deployment

advanced

Train a fraud detection model and deploy it as a production-ready REST API with monitoring and CI/CD.

Requirements
MLflow experiment tracking, FastAPI, Docker, model deployment, data drift monitoring
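
For the data drift monitoring requirement, one commonly used statistic is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with what the API sees in production. This is a minimal sketch, assuming toy transaction amounts, hand-picked bin edges, and a 0.25 alert threshold that is only a commonly quoted rule of thumb; other drift tests are equally valid choices.

```python
import math

def psi(expected, actual, bin_edges, eps=1e-6):
    """Population Stability Index between two samples over shared bins.

    bin_edges must be sorted; a value v falls into bin k, where k is the
    number of edges <= v. Empty bins are floored at eps to avoid log(0).
    """
    def proportions(values):
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            counts[sum(1 for edge in bin_edges if v >= edge)] += 1
        return [max(c / len(values), eps) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Toy transaction amounts for illustration only.
training_amounts = [10, 20, 30, 40, 50, 60, 70, 80]
live_amounts = [60, 70, 80, 90, 100, 110, 120, 130]
edges = [25, 50, 75]

DRIFT_THRESHOLD = 0.25  # assumption: rule-of-thumb alert level, tune per feature
score = psi(training_amounts, live_amounts, edges)
print(score, score > DRIFT_THRESHOLD)
```

In a deployed service, a check like this would run on a schedule over recent request logs and raise an alert or trigger retraining when the score crosses the threshold.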

End-to-End Kaggle Competition Pipeline

advanced

Simulate a full competitive data science workflow by building a stacked ensemble for a structured prediction problem.

Requirements
ensemble methods, stacking, cross-validation strategy, CatBoost/LightGBM, competitive ML workflow
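
The core of stacking is that the meta-model trains on out-of-fold predictions, so every row's base prediction comes from a model that never saw that row. Here is a rough sketch of that mechanic, assuming two deliberately trivial hypothetical base models and toy data; a real pipeline would plug in CatBoost or LightGBM and shuffled or stratified folds.

```python
def k_fold_indices(n_rows, n_folds):
    """Split row indices 0..n_rows-1 into n_folds contiguous validation folds."""
    sizes = [n_rows // n_folds + (1 if i < n_rows % n_folds else 0) for i in range(n_folds)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fit_mean(xs, ys):
    """Hypothetical base model 1: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Hypothetical base model 2: one-feature least squares line."""
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    return lambda x: slope * x + (y_mean - slope * x_mean)

def out_of_fold_predictions(fit, xs, ys, n_folds=3):
    """Predict each row with a model trained only on the other folds."""
    oof = [None] * len(xs)
    for valid_idx in k_fold_indices(len(xs), n_folds):
        held_out = set(valid_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        for i in valid_idx:
            oof[i] = model(xs[i])
    return oof

# Toy data for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 12]

oof_mean = out_of_fold_predictions(fit_mean, xs, ys)
oof_line = out_of_fold_predictions(fit_line, xs, ys)
meta_features = list(zip(oof_mean, oof_line))  # one row per sample, one column per base model
print(meta_features)
```

A meta-model would then be fit on `meta_features` against `ys`; fitting it on in-fold predictions instead would leak the target and inflate validation scores.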