MAD Program: Flexible Data Science and Machine Learning Curriculum

Learning Model: Learning by Doing

This program emphasizes practical, hands-on learning in a business environment.

Key Points:

  • Real Tasks and Projects: Interns work on actual or simulated business tasks.
  • Skill Development: Focuses on problem-solving, critical thinking, and project-related skills.
  • Support: Mentors provide guidance, feedback, and support.

Self-Directed Learning and Mentor Support

Learning Structure:

  • Self-Directed (80%): Interns use platforms like Udemy, Snowflake, and Microsoft Learning to work independently on courses and challenges.
  • Mentor Support (20%): Each intern has a mentor who offers guidance, feedback, and periodic check-ins.

Learning Goals and Feedback

  • Goal Setting: Interns set specific learning goals with mentors at the beginning of each cycle.
  • Review and Transition: Upon course completion, mentors review outcomes and set new learning paths based on the intern’s development and company needs.

Daily Progress Tracking and Mentor Interaction

Daily Reports:

Interns submit a daily report on:
  • Learning Activities: Topics studied or projects worked on.
  • Challenges: Any obstacles faced.
  • Reflections: What was learned and applied.
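
One way to keep daily reports consistent is to give them a fixed structure that mirrors the three items above. The Python sketch below is illustrative only: the DailyReport class, its field names, and the sample entry are assumptions, not part of the program's actual tooling.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DailyReport:
    """Hypothetical container for one intern's report for a single day."""
    intern: str
    day: date
    learning_activities: list[str] = field(default_factory=list)
    challenges: list[str] = field(default_factory=list)
    reflections: list[str] = field(default_factory=list)

    def to_text(self) -> str:
        """Render the report as plain text, one section per report item."""
        sections = [
            ("Learning Activities", self.learning_activities),
            ("Challenges", self.challenges),
            ("Reflections", self.reflections),
        ]
        lines = [f"Daily report: {self.intern}, {self.day.isoformat()}"]
        for title, items in sections:
            lines.append(f"{title}:")
            # Show an explicit placeholder when a section is empty.
            lines.extend(f"  - {item}" for item in (items or ["(none)"]))
        return "\n".join(lines)


# Sample entry (invented for illustration).
report = DailyReport(
    intern="A. Intern",
    day=date(2024, 1, 15),
    learning_activities=["Worked through a missing-values notebook"],
    reflections=["Median imputation is less sensitive to outliers than the mean"],
)
text = report.to_text()
print(text)
```

Leaving a section empty still produces a heading with a "(none)" placeholder, so a mentor can see at a glance that no challenges were reported rather than wondering whether the section was forgotten.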

Communication:

  • Slack: Used for real-time support from mentors and peers.

Flexible Data Science and Machine Learning Curriculum: Practical Guidance Through Real-World Projects and Simulated Exercises
  • Business Problem Situation: Sales Data Inconsistency at Adventure Works: The company has noticed discrepancies in sales data across regions. They need to clean and standardize the data for accurate reporting.
  • General Concept Area: Data Cleaning and Preprocessing
  • Specific Objective: Apply – Implement appropriate data cleaning techniques.
  • Associated Content/Concepts: Data cleaning, handling missing values, data standardization, outlier detection.
  • Notebooks and Challenges: Notebook 1 **; Handling Missing Values – San Francisco
  • Microsoft Learning Path / Udemy / Kaggle: Explore and analyze data with Python
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Customer Segmentation for Targeted Marketing: Adventure Works wants to segment customers to tailor marketing campaigns more effectively.
  • General Concept Area: Exploratory Data Analysis (EDA)
  • Specific Objective: Analyze – Categorize customer data characteristics and perform clustering to identify distinct segments.
  • Associated Content/Concepts: Clustering techniques (e.g., K-means), customer profiling, data visualization for segmentation.
  • Notebooks and Challenges: Notebook 2; House Prices
  • Microsoft Learning Path / Udemy / Kaggle: Different Clustering Techniques and Algorithms – Kaggle
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Predicting Product Demand: Adventure Works needs to forecast product demand for better inventory management.
  • General Concept Area: Predictive Modeling
  • Specific Objective: Classify – Use historical sales data to classify and predict future demand for various products.
  • Associated Content/Concepts: Time series analysis, regression models, forecasting techniques.
  • Notebooks and Challenges: Notebook 3; Corporación Favorita Grocery Sales Forecasting
  • Microsoft Learning Path / Udemy / Kaggle: Time Series – Kaggle
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Analyzing Customer Churn: The company is concerned about losing customers and wants to predict churn to take proactive measures.
  • General Concept Area: Predictive Modeling
  • Specific Objective: Compare – Evaluate different predictive models to determine the most accurate for predicting customer churn.
  • Associated Content/Concepts: Logistic regression, decision trees, performance metrics (e.g., accuracy, recall).
  • Notebooks and Challenges: Notebook 4; Telco Customer Churn
  • Microsoft Learning Path / Udemy / Kaggle: Python for Data Analysis: Logistic Regression Techniques – Udemy
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Optimizing Pricing Strategies: Adventure Works is exploring optimal pricing strategies to maximize profit margins across different regions.
  • General Concept Area: Optimization and Algorithms
  • Specific Objective: Apply – Implement and compare pricing algorithms to find the optimal pricing strategy.
  • Associated Content/Concepts: Price elasticity, optimization algorithms, A/B testing.
  • Notebooks and Challenges: Notebook 5; Rossmann Store Sales
  • Microsoft Learning Path / Udemy / Kaggle: Mastering Machine Learning Algorithms using Python – Udemy
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Visualizing Sales Trends for Executives: The sales team needs to present sales trends and forecasts to executives in a clear and compelling way.
  • General Concept Area: Data Visualization and Communication
  • Specific Objective: Design – Create interactive dashboards that visualize sales data trends and forecast future sales.
  • Associated Content/Concepts: Data visualization principles, dashboard design, tools like Tableau or Power BI.
  • Notebooks and Challenges: Notebook 6; Store Sales
  • Microsoft Learning Path / Udemy / Kaggle: Implement advanced data visualization techniques by using Power BI
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Developing a Recommender System for Adventure Works: The company wants to implement a recommender system to suggest products to customers based on their purchase history and browsing behavior.
  • General Concept Area: Machine Learning and Deep Learning (MML)
  • Specific Objective: Design – Develop and implement a recommender system using advanced machine learning techniques.
  • Associated Content/Concepts: Collaborative filtering, content-based filtering, neural networks, deep learning techniques.
  • Notebooks and Challenges: Notebook 7; MovieLens Recommendation Systems
  • Microsoft Learning Path / Udemy / Kaggle: Azure Data Scientist self-paced training
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Building a Chatbot for Customer Service: Adventure Works wants to deploy a chatbot that can assist customers with common queries using AI-driven natural language processing.
  • General Concept Area: Machine Learning and Deep Learning (MML)
  • Specific Objective: Create – Design and implement a chatbot using modern natural language processing techniques.
  • Associated Content/Concepts: Natural language processing (NLP), chatbot frameworks (e.g., Rasa, Dialogflow), neural networks.
  • Notebooks and Challenges: Notebook 8; Twitter Sentiment Analysis
  • Microsoft Learning Path / Udemy / Kaggle: Develop natural language processing solutions
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

  • Business Problem Situation: Exploring Large Language Models (LLMs) and Generative AI: A tech company is exploring the use of Generative AI to automate content creation and enhance customer interactions.
  • General Concept Area: Machine Learning and Deep Learning (MML)
  • Specific Objective: Apply – Implement the foundational concepts of LLMs and Generative AI in real-world applications.
  • Associated Content/Concepts: Large Language Models (LLMs), Generative AI, AI-driven content creation, application prototyping.
  • Notebooks and Challenges: Notebook 9
  • Microsoft Learning Path / Udemy / Kaggle: Introduction to Generative AI – Databricks. Optional resources: LangChain with Python Bootcamp – Udemy; AI Python for Beginners – DeepLearning.AI; Foundations of Data Science for Machine Learning ***
  • Real Scenario in the Company: Tasks will be assigned according to ongoing data projects.

**Note: The notebooks (1 to 9) provided in this curriculum are for internal use only. Their URLs are not to be published or shared externally.
***Note: The exercise runs perfectly locally but generates an error on the Microsoft website.
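
As an illustration of the data cleaning topics listed for the first business problem (handling missing values, data standardization, outlier detection), the sketch below works through all three on a tiny invented dataset. It uses only the Python standard library so it runs anywhere; the records, function names, median imputation, and the 1.5 × IQR rule are assumptions chosen for illustration and are not taken from Notebook 1. In practice a library such as pandas would typically be used for the same steps.

```python
from statistics import median, quantiles

# Invented sales records; None marks a missing amount.
raw_sales = [
    {"region": " north ", "amount": 120.0},
    {"region": "North", "amount": None},
    {"region": "SOUTH", "amount": 95.0},
    {"region": "south", "amount": 110.0},
    {"region": "North", "amount": 105.0},
    {"region": "South", "amount": 2500.0},  # suspiciously large value
]


def clean_sales(records):
    """Standardize region labels and fill missing amounts with the median."""
    known = [r["amount"] for r in records if r["amount"] is not None]
    fill_value = median(known)
    cleaned = []
    for r in records:
        amount = r["amount"] if r["amount"] is not None else fill_value
        # Trim whitespace and normalize capitalization of the region name.
        cleaned.append({"region": r["region"].strip().title(), "amount": amount})
    return cleaned


def flag_outliers(records, k=1.5):
    """Return records whose amount falls outside the IQR fences."""
    amounts = [r["amount"] for r in records]
    q1, _, q3 = quantiles(amounts, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in records if not (low <= r["amount"] <= high)]


cleaned = clean_sales(raw_sales)
outliers = flag_outliers(cleaned)
print(sorted({r["region"] for r in cleaned}))
print(outliers)
```

Flagged records are returned rather than dropped, on the assumption that an analyst should decide whether an extreme value is a data entry error or a genuine large sale.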

Tools and Technologies

Key technologies and tools in this curriculum include:
  • Git: Version control
  • Azure DevOps: CI/CD, project management
  • Docker: Containerization
  • SQL and Python: Data querying and analysis
  • Power BI, Tableau: Data visualization
  • Databricks, Snowflake: Data processing and analytics

Comprehensive Azure DevOps Path
  • General Concept Area: Communication and Collaboration in DevOps
  • Specific Objective (Bloom’s Taxonomy): Understand – Facilitate effective communication and collaboration
  • Associated Content/Concepts: Continuous planning, continuous collaboration, communicating deployment and release info
  • Microsoft Learning Path: AZ-400: Facilitate Communication and Collaboration

  • General Concept Area: Evolving DevOps Practices
  • Specific Objective (Bloom’s Taxonomy): Analyze – Evaluate and optimize DevOps practices
  • Associated Content/Concepts: Value stream mapping, Azure DevOps setup, Azure Boards, workload optimization for sprints
  • Microsoft Learning Path: Evolve Your DevOps Practices

  • General Concept Area: Foundations of DevOps
  • Specific Objective (Bloom’s Taxonomy): Analyze – Identify challenges in traditional application lifecycles
  • Associated Content/Concepts: Application lifecycle management, challenges of traditional vs. DevOps approaches, operational inefficiencies
  • Microsoft Learning Path: DevOps Foundations: Core Principles and Practices

Evaluation and Progress

Performance is assessed through:
  • Daily Progress Tracking: Regular updates and reflections on learning.
  • Institutional Evaluations: Formal evaluations based on performance, skill application, and potential for future roles.