ECEN 250 Lab3 document, Lab Reports of Machine Learning

Lab 3 detailed instructions and tips

Typology: Lab Reports

2024/2025

Uploaded on 06/26/2025

matthew-kam

ECEN 250: Machine Learning for Electrical Engineers
Lab 3
Assigned: 6/5/25
Due:
  Prelab: NO PRELAB FOR THIS LAB
  Jupyter Lab 3 notebook must be uploaded by 11:59PM on 6/11/25
Potential Points: 100
Description
Lab Part 1: Clustering on Synthetic Datasets
You will use k-means clustering to analyze several synthetic datasets. You will learn how to create a
clustering model, use that model to predict which clusters new observations belong to, plot clusters
with centroids and decision boundaries, and optimize your clusters to improve results.
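As a rough illustration of those steps (not the notebook's own code), the sketch below assumes scikit-learn and NumPy, which this handout does not name. The blob centers, sample counts, and new points are made-up values. It fits a model, predicts clusters for new observations, labels a grid of points to obtain decision regions, and compares inertia across several values of k as one common way to tune the number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data with three well-separated groups (illustrative values only).
true_centers = [(-5.0, -5.0), (0.0, 5.0), (5.0, -5.0)]
X, _ = make_blobs(n_samples=300, centers=true_centers, cluster_std=0.8, random_state=0)

# Fit a k-means model; n_init is set explicitly because its default varies by version.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = model.labels_               # cluster index for each training point
centroids = model.cluster_centers_   # one (x, y) centroid per cluster

# Assign new observations to the nearest learned centroid.
new_points = np.array([[-4.5, -5.5], [0.5, 4.0], [5.5, -4.0]])
new_labels = model.predict(new_points)

# Decision regions: predict a cluster for every point on a grid covering the data.
# The resulting array can be passed to a filled-contour plot to draw the boundaries.
xs = np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200)
ys = np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200)
xx, yy = np.meshgrid(xs, ys)
grid_labels = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# One common tuning aid: watch how inertia (within-cluster squared distance) falls with k.
inertias = {}
for k in range(1, 7):
    inertias[k] = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

With data like this, inertia typically drops sharply up to the true number of groups and flattens afterward; the notebook may direct you to a different selection method.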
You will need to copy the skeleton Jupyter notebook from Canvas into your Google Drive and open that
notebook in Colab. Follow the procedure from Lab 2 for getting the notebook on Drive and opening it with
Colab. The notebook is: ECEN250_Lab3.ipynb
For this part, follow the detailed instructions in the notebook, modifying code cells, adding code cells,
and entering information in text cells as directed.
Lab Part 2: Generating statistics for your clean blower data
In Part 2 of the Lab 3 notebook, you will load the CSV that you created at the end of Lab 2, which
contains your cleaned blower data. If you did not complete Lab 2, you must complete it before starting
Part 2 of Lab 3. In Part 2, you will generate statistics for features in your blower dataset, visualize
the features, use scatter plots to examine multiple features, and examine subsets of your blower dataset.
For this part, follow the detailed instructions in the notebook, modifying code cells, adding code cells,
and entering information in text cells as directed.
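The kind of feature statistics and subset statistics described for Part 2 can be sketched with pandas as follows. The column names (price, batteries, cfm) and all values are hypothetical stand-ins, not the notebook's actual schema. In the notebook you would load your own CSV from Lab 2 instead of building a DataFrame by hand.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned blower data; column names and values are
# invented for illustration. In the notebook you would read your Lab 2 CSV instead.
df = pd.DataFrame({
    "price": [99.0, 149.0, 199.0, 129.0, 249.0, 179.0],
    "batteries": [0, 1, 1, 0, 2, 1],
    "cfm": [350, 450, 600, 400, 650, 550],
})

# Summary statistics (count, mean, std, min, quartiles, max) for every numeric feature.
summary = df.describe()
print(summary)

# Statistics for price within each subset defined by battery count.
# The "count" column shows how many entries back each subset's mean.
by_batteries = df.groupby("batteries")["price"].agg(["count", "mean", "min", "max"])
print(by_batteries)

# A single subset selected with a boolean mask.
one_battery = df[df["batteries"] == 1]
print(one_battery["price"].mean())
```

Including the count next to each mean makes it easy to spot subsets that rest on only one or two entries.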
NOTE: If you failed to properly clean your dataset in Lab 2, portions of Part 2 may not run. If that is
the case, fix those issues, either by redoing portions of Lab 2 and regenerating the clean data CSV, or
by taking code from Lab 2 and adding it at the start of Lab 3 Part 2 to fix problems with your dataset.
NOTE: Your data likely differs significantly from the data used to create the skeleton notebook. Watch
for cases where plot settings need to be adjusted because your data has different ranges than the
dataset used to create the skeleton.
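One way to avoid hard-coded plot ranges is to derive axis limits from your own data. The helper below is a hypothetical sketch (padded_limits is not part of the notebook or of any library). It returns the minimum and maximum of a feature, widened by a small margin, and those values could then be passed to whatever axis-limit settings the skeleton's plotting cells use.

```python
import pandas as pd

def padded_limits(values, pad_fraction=0.05):
    """Return (low, high) axis limits covering the data plus a small margin."""
    lo, hi = float(values.min()), float(values.max())
    span = hi - lo
    if span == 0:
        # All values identical: fall back to a margin based on the value itself.
        span = abs(lo) if lo != 0 else 1.0
    pad = span * pad_fraction
    return lo - pad, hi + pad

# Hypothetical price values standing in for one feature of the blower dataset.
prices = pd.Series([99.0, 149.0, 199.0, 129.0, 249.0, 179.0], name="price")
x_min, x_max = padded_limits(prices)
print(x_min, x_max)
```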
NOTE: If your blower data does not have a variety of values for every feature, then some of the
statistics will not be very meaningful. For instance, if your blower data only includes entries for
blowers that include 1 battery in the price (i.e., you have no entries for zero batteries, 2 batteries,
...), then statistics for the mean cost of the subset of blowers with 2 batteries will not be
meaningful. Computing and presenting meaningless statistics will allow you to complete this lab, but
will make Lab 4 exceptionally difficult, and your results and grade will suffer. It is better to
examine your dataset for Lab 3 and, if necessary, add additional items to your list of blowers. Recall
our lecture discussions on sampling plans: having samples that do not reflect the variety of values for
the features we are recording will cause model issues. You may end up adding 5 to 10 additional blowers
to improve the quality of your dataset. Doing that in Lab 3 is better than waiting until Lab 4! To add
additional blowers to your dataset, you can either go
back to your source CSV for Lab 2, add the new items to the 50 you had, and rerun the cells in the Lab 2
notebooks to create a new BlowerDataClean.csv file. [You may need to rename your old clean CSV first;
your df.to_csv() may fail if the file exists.]
Lab Part 3: Machine learning on your blower data
In Part 3 of the Lab 3 notebook, you will use features in your blower dataset to create unsupervised
clustering models. You will look for characteristics in the data for individual features and examine
how multiple features cluster in pair-wise scatter plots and in 2D clustering examples.
For this part, follow the detailed instructions in the notebook, modifying code cells, adding code cells,
and entering information in text cells as directed.
NOTE: You could have some remaining issues where your dataset does not have a sufficient range of
values for some features, as in Part 2. Repeat those steps to add additional data entries to improve
your clustering if necessary.
When you have completed your notebook, be sure all cells have been run in the correct order and the
results are shown, then download your notebook, rename it if necessary, and upload it to Canvas by the
due date.
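For reference, the Part 3 idea of clustering on a pair of blower features might look roughly like the sketch below. It assumes pandas and scikit-learn, and the column names (price, cfm) and values are invented. Standardizing the features first is a choice made for this sketch, so that the feature with the larger numeric range does not dominate the distance calculation; the notebook may or may not do the same.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for two features of the cleaned blower data.
df = pd.DataFrame({
    "price": [79.0, 89.0, 99.0, 229.0, 249.0, 259.0],
    "cfm": [300, 320, 350, 600, 640, 650],
})

# Put both features on a comparable scale before computing distances.
features = df[["price", "cfm"]]
scaled = StandardScaler().fit_transform(features)

# Unsupervised clustering on the pair of features; k=2 is an illustrative choice.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)
df["cluster"] = model.labels_

# Per-cluster feature means, in original units, help interpret what each cluster represents.
print(df.groupby("cluster")[["price", "cfm"]].mean())
```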