Lecture 6

Contents

Part 1

Q1: How would you define Machine Learning?

Machine Learning is the study of algorithms that enable computers to learn patterns from data and make predictions or decisions without being explicitly programmed.

Q2: Can you name four types of problems where it shines?

  • Classification (spam detection, image recognition)
  • Regression (predicting prices, temperature forecasting)
  • Clustering (customer segmentation, grouping similar items)
  • Anomaly detection (fraud detection, predictive maintenance)

Q3: What is a labeled training set?

A labeled training set is a dataset where each example comes with the correct output (label), which is used to train supervised learning algorithms.

Q4: What are the two most common supervised tasks?

  • Classification
  • Regression

Q5: Can you name four common unsupervised tasks?

  • Clustering
  • Dimensionality reduction
  • Anomaly detection
  • Association rule learning

Q6: What type of Machine Learning algorithm would you use to allow a robot to walk in various unknown terrains?

Reinforcement Learning, as it allows the robot to learn optimal actions through trial-and-error interaction with the environment.

Q7: What type of algorithm would you use to segment your customers into multiple groups?

Unsupervised learning, specifically clustering algorithms (e.g., K-Means, hierarchical clustering).
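
As a rough illustration of clustering, here is a minimal K-Means sketch on one-dimensional data, written in plain Python. The spend figures, the starting centroids, and the function name `kmeans_1d` are assumptions made up for this example; a real project would more likely use a library implementation and several features per customer.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Tiny K-Means on 1-D data: assign each value to its nearest centroid,
    then move each centroid to the mean of its assigned values."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical annual spend per customer (no labels are given).
spend = [12, 15, 14, 90, 95, 88, 300, 310]
centroids, clusters = kmeans_1d(spend, centroids=[0, 100, 200])
print(clusters)  # three groups: low, medium, and high spenders
```

Note that no labels are supplied: the groups emerge only from distances between the values, which is what makes this an unsupervised task.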

Q8: Would you frame the problem of spam detection as supervised or unsupervised?

Supervised learning, because labeled examples of spam and non-spam emails are available for training.

Q9: What is an online learning system?

An online learning system updates the model incrementally as new data arrives, rather than retraining from scratch on the full dataset.
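
A minimal sketch of the idea, assuming a one-feature linear model updated by stochastic gradient descent on squared error. The class name `OnlineLinearRegressor`, the learning rate, and the synthetic stream are illustrative assumptions, not part of the lecture.

```python
class OnlineLinearRegressor:
    """Linear model y = w * x + b, updated one example at a time."""

    def __init__(self, learning_rate=0.01):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def learn_one(self, x, y):
        # One gradient step on the squared error of this single example.
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error


model = OnlineLinearRegressor(learning_rate=0.05)

# Simulated data stream following y = 2x + 1; each example is seen and then discarded.
stream = [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0, 3.0)] * 500
for x, y in stream:
    model.learn_one(x, y)

print(round(model.w, 3), round(model.b, 3))
```

The model never holds the full dataset: each call to `learn_one` adjusts the current weights using only the newest example.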

Q10: What is out-of-core learning?

Out-of-core learning trains a model on a dataset too large to fit in main memory: the data is loaded in mini-batches, and the model is updated incrementally on each batch, typically with an online learning algorithm.
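
The following sketch shows one way this could look, assuming the data lives in a CSV-like source that is read a few rows at a time. The in-memory `CSV_TEXT` string stands in for a file too large to load, and the helper names `read_batches` and `train_out_of_core` are hypothetical.

```python
import io
from itertools import islice


def read_batches(lines, batch_size):
    """Yield lists of (x, y) pairs, holding only one batch in memory at a time."""
    rows = iter(lines)
    while True:
        batch = [tuple(map(float, line.split(","))) for line in islice(rows, batch_size)]
        if not batch:
            return
        yield batch


def train_out_of_core(open_source, batch_size=2, epochs=500, learning_rate=0.05):
    """Fit y = w * x + b with mini-batch gradient descent over a re-openable source."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for batch in read_batches(open_source(), batch_size):
            grad_w = sum((w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum((w * x + b - y) for x, y in batch) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Stand-in for a large file on disk; the rows follow y = 2x + 1.
CSV_TEXT = "0,1\n1,3\n2,5\n3,7\n"
w, b = train_out_of_core(lambda: io.StringIO(CSV_TEXT))
print(round(w, 2), round(b, 2))
```

With a real file, `open_source` would be something like `lambda: open(path)`, so that each epoch streams the file from disk again instead of keeping it in memory.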

Q11: What type of learning algorithm relies on a similarity measure to make predictions?

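Instance-based (lazy) learning algorithms, such as k-Nearest Neighbors (k-NN). They store the training examples and predict by comparing each new input to them using a similarity measure.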
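
A small k-NN sketch, assuming Euclidean distance as the similarity measure and majority voting among the k closest examples. The toy points and labels are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Return the majority label among the k training points closest to `query`."""
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled points: (features, label)
training_set = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

print(knn_predict(training_set, (1.1, 1.0)))
print(knn_predict(training_set, (5.1, 5.0)))
```

There is no training step here: the "model" is simply the stored examples, and all the work happens at prediction time.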

Q12: Difference between model parameter and hyperparameter?

Model parameters are learned from training data (e.g., weights in a neural network), while hyperparameters are set before training (e.g., learning rate, number of layers).

Q13: What do model-based learning algorithms search for?

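They search for the values of the model parameters that let the model generalize best to new data. Training usually means minimizing a cost function that measures prediction error on the training set (plus a penalty for model complexity if the model is regularized). Predictions are then made by feeding new inputs into the fitted model.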
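
As a sketch of this search, the example below fits a straight line by ordinary least squares, using the closed-form solution that minimizes mean squared error. The data points and the function names `fit_line` and `mean_squared_error` are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(slope, intercept, xs, ys):
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)           # the learned model parameters
prediction = slope * 10.0 + intercept          # prediction for an unseen input
print(slope, intercept, prediction)
```

The two numbers returned by `fit_line` are the whole model: once they are found, the training data is no longer needed to make predictions.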

Q14: Four main challenges in Machine Learning?

  • Insufficient or poor-quality data
  • Overfitting / underfitting
  • Feature selection and representation
  • Computational complexity and scalability

Q15: Model performs great on training but poorly on new data?

This is overfitting. Possible solutions:

  • Use more training data
  • Regularization
  • Simplify the model
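
To make the regularization option concrete, here is a sketch of L2 (ridge) regularization for a one-feature linear model without an intercept. The closed form follows from setting the derivative of the penalized cost to zero; the data values and the name `ridge_slope` are assumptions for this example.

```python
def ridge_slope(xs, ys, alpha):
    """Minimize sum((w * x - y)^2) + alpha * w^2 over w.

    Setting the derivative to zero gives w = sum(x * y) / (sum(x^2) + alpha).
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

unregularized = ridge_slope(xs, ys, alpha=0.0)
regularized = ridge_slope(xs, ys, alpha=10.0)
print(unregularized, regularized)  # the penalty shrinks the weight toward zero
```

Here `alpha` controls the strength of the penalty: a larger value constrains the weight more, trading a slightly worse fit on the training data for a simpler model.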

Q16: What is a test set and why use it?

A test set is data held out from training and used only to estimate how well the model generalizes to new, unseen data before it is deployed.

Q17: Purpose of a validation set?

A validation set is used to compare candidate models and tune hyperparameters, so that the test set stays untouched for the final evaluation.
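
A minimal sketch of how the three sets might be carved out of one dataset. The 60/20/20 proportions, the fixed seed, and the function name `split_dataset` are assumptions; suitable fractions depend on how much data is available.

```python
import random


def split_dataset(examples, validation_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle once, then slice into train, validation, and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_validation = int(len(shuffled) * validation_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_validation]
    train = shuffled[n_test + n_validation:]
    return train, validation, test


train, validation, test = split_dataset(range(10))
print(len(train), len(validation), len(test))
```

Models are fitted on `train`, compared on `validation`, and the chosen model is scored once on `test` at the very end.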

Q18: What can go wrong if you tune hyperparameters using the test set?

It leads to overfitting to the test set, giving overly optimistic estimates of model performance.

Q19: What is cross-validation and why prefer it?

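Cross-validation splits the training data into k folds. The model is trained k times, each time on k-1 folds and evaluated on the remaining fold, and the k scores are averaged. This gives a more reliable performance estimate than a single validation set and makes use of all the training data, at the cost of k times the training time.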
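
The procedure can be sketched in plain Python as follows. The "model" here is deliberately trivial (it always predicts the mean training label) and is only an assumption to keep the example short; the helper names are hypothetical, and in practice the data would usually be shuffled before being split into folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def cross_validate(xs, ys, k, fit, error):
    """Average the validation error of `fit` over k folds."""
    scores = []
    for train, validation in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(error(model, [xs[i] for i in validation], [ys[i] for i in validation]))
    return sum(scores) / len(scores)


# Toy model for illustration: always predict the mean of the training labels.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def mean_absolute_error(model, xs, ys):
    return sum(abs(model - y) for y in ys) / len(ys)


ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
xs = list(range(len(ys)))
cv_error = cross_validate(xs, ys, k=3, fit=fit_mean, error=mean_absolute_error)
print(cv_error)
```

Every example is used for validation exactly once and for training k-1 times, which is why the averaged score is less sensitive to one lucky or unlucky split.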

Part 2

Q1: What is the primary goal of the Big Picture phase in Edge AI–IoT design?

The Big Picture phase defines what to build, why it matters, and how it fits into the ecosystem. It bridges human-centered design, data reasoning, and systems engineering.

Q2: What are the four stages of the Double Diamond design framework?

  • Discover — Divergent exploration of the problem
  • Define — Converge to a clear problem
  • Develop — Divergent exploration of solutions
  • Deliver — Converge via prototyping and testing

Q3: How does Design Thinking emphasize solving problems in Edge AI & IoT?

Design Thinking uses empathy-driven observation, ideation, prototyping, and testing to create user-centered solutions while integrating iterative learning and feedback loops.

Q4: What is BizML and why is it important?

BizML is a business-driven ML methodology that ensures every ML project delivers measurable business or operational value, aligning data science, engineering, and strategy.

Q5: Name the six steps of BizML.

  1. Value — Define business impact
  2. Target — Specify prediction objectives
  3. Performance — Set KPIs and metrics
  4. Fuel — Prepare data
  5. Algorithm — Train predictive model
  6. Launch — Deploy and monitor model

Q6: How does Systems Thinking help in Edge AI/IoT projects?

Systems Thinking maps interactions among humans, devices, and cloud components, identifies leverage points and feedback loops, and integrates safety, trust, and reliability considerations.

Q7: What is Design Swigle and when is it used?

Design Swigle is a hybrid agile–human-centered framework for continuous improvement in Edge AI. It enables rapid persona creation, prototyping, and feedback-driven refinement of ML systems.

Q8: Give an example of how the "Discover" phase of Double Diamond is applied in IoT.

Conduct ethnographic studies and sensor audits to understand users and devices, identify friction points like latency or unreliable connectivity, and collect human stories showing where edge intelligence adds value.

Q9: How are feedback loops integrated across Edge AI/IoT methodologies?

  • User feedback refines design.
  • Telemetry informs system architecture adjustments.
  • Model drift triggers retraining.
  • Business KPIs realign objectives.
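
One of these loops, model drift triggering retraining, could be sketched as a rolling error monitor running next to the deployed model. The class name `DriftMonitor`, the window size, and the threshold are hypothetical choices for illustration, not values from the lecture.

```python
from collections import deque


class DriftMonitor:
    """Flag retraining when the mean absolute error over a recent window gets too high."""

    def __init__(self, window=3, threshold=0.5):
        self.errors = deque(maxlen=window)  # keeps only the most recent errors
        self.threshold = threshold

    def record(self, prediction, actual):
        """Store one observed error and return True if retraining should be triggered."""
        self.errors.append(abs(prediction - actual))
        window_full = len(self.errors) == self.errors.maxlen
        mean_error = sum(self.errors) / len(self.errors)
        return window_full and mean_error > self.threshold


monitor = DriftMonitor(window=3, threshold=0.5)

# Telemetry as (prediction, ground truth) pairs; the last reading is far off.
readings = [(1.0, 1.1), (2.0, 2.1), (3.0, 3.0), (4.0, 6.0)]
flags = [monitor.record(prediction, actual) for prediction, actual in readings]
print(flags)
```

On an edge device, a `True` flag would typically be reported upstream so that retraining can be scheduled, rather than retraining on the device itself.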

Q10: Why is integrating methodology, ML lifecycle, and edge deployment critical in IoT?

It ensures that solutions are human-centered, technically feasible, aligned with business goals, and maintainable at the edge with continuous learning and monitoring loops.