Machine Learning Project
The machine learning project can be completed individually or in groups of up to 3 people. The project should use available data and should involve classification (supervised learning), clustering (unsupervised learning), or regression.
Project Ideas
There are potential projects in Kaggle Datasets, UCI Datasets, Unearthed Solutions, Open Source Drilling Community, Datasets for Research, or through Data Science Competitions. Below is a list of data sets that are applicable to engineering.
- Abalone Age
- Algae Testbed ATP3
- Asteroid Hazard Classification
- Challenger O-ring Regression
- Concrete Regression
- Continuous Manufacturing - Example Solution
- Drug Classification for 5 Drugs
- Energy Efficiency Regression
- Forest Fire Burned Area Regression
- Gas Sensor Array Temperature Modulation
- Glass Type Classification
- Grid Loss Prediction
- Grid Power Outages
- Ionosphere Classification
- March Madness Bracket Prediction
- Mineral Database Regression
- Mineral Identification
- Mining Hydrosaver - Example Solution
- Moneyball Baseball Statistics
- Predict Healthy or Unhealthy Patients from Mass Spec Data
- Renewable Energy Production in California
- Semiconductor Manufacturing Classification
- Smart Grid Stability
- Solar Production and Outages
- Superconductor Properties
- Yacht Hydrodynamics Regression
The project includes a proposal, progress reports, a final report, and a presentation.
Project Proposal
- Identify a case study that is related to machine learning in engineering.
- Draw a diagram of the system with all features (input predictors) and labels (outputs) clearly indicated.
- Detail the parts of the project that are classification, regression, clustering, or dimensionality reduction.
- List articles (2-3) that give related results or locate authors that have worked in this area. Search online with Google Scholar, Science Direct, or other relevant search engines for scholarly work.
- List prior work or available data sets. Search online with Kaggle, Google, or other relevant repositories or search engines.
- Detail a plan for collecting and cleansing the data.
- List factors that may influence the success of the project. Where are the uncertainties and how will these uncertain factors be addressed?
- What is the timeline for the project and the anticipated final product for this project?
The purpose of the progress reports is to provide intermediate checkpoints and to review current progress. The expectations for each progress report are discussed below.
Data Visualization, Exploration, and Cleansing
The project proposal should include a description of how you have consolidated the data and prepared it for machine learning. The proposal should also give an update on the project timeline and discuss any uncertainties. The proposal should be a draft section of the final report that includes an introduction, literature review, and data summary with visualization.
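As one possible illustration of the data preparation described above, the following minimal sketch drops incomplete records and standardizes the remaining feature columns using only the Python standard library. The records in `raw_rows` and the helper names `is_missing`, `drop_incomplete`, and `standardize` are hypothetical and chosen for illustration; a real project would load its own data set and may need a different cleansing strategy, such as imputation instead of dropping rows.

```python
import math
import statistics

# Hypothetical raw records: (feature_1, feature_2, label).
# None and NaN both mark a missing value.
raw_rows = [
    (1.0, 200.0, 0),
    (2.0, None, 1),
    (3.0, 220.0, 1),
    (4.0, 210.0, 0),
    (float("nan"), 205.0, 1),
    (5.0, 230.0, 1),
]

def is_missing(value):
    """Return True for None or NaN entries."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def drop_incomplete(rows):
    """Keep only the rows in which every field is present."""
    return [row for row in rows if not any(is_missing(v) for v in row)]

def standardize(column):
    """Scale a column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]

clean_rows = drop_incomplete(raw_rows)
features = [standardize([row[i] for row in clean_rows]) for i in range(2)]
labels = [row[2] for row in clean_rows]

# Short data summary that could accompany a plot in the report
print(f"kept {len(clean_rows)} of {len(raw_rows)} rows")
for i, column in enumerate(features):
    print(f"feature {i + 1}: min={min(column):.2f}, max={max(column):.2f}")
```

Reporting how many rows were removed, as the summary line does, helps a reader judge whether the cleansing step may have biased the data.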
Project Progress Report
Regression, Classification, Clustering
The project progress report should include discussion and results for the regression, classification, or clustering models. An important part of any machine learning project is to report the results on test data that has not been used for training. The progress report should be a draft section of the final report that includes multiple machine learning strategies and performance metrics on the test data.
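The idea of reporting results on held-out test data for more than one strategy can be sketched as follows. This is a minimal, standard-library example under stated assumptions: the data are synthetic (a noisy line), the two strategies (a mean baseline and a least-squares line) are placeholders for whatever models a project compares, and the helper names `split_train_test`, `fit_mean`, `fit_line`, and `rmse` are hypothetical.

```python
import math
import random

def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_mean(train):
    """Baseline strategy: always predict the mean training label."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Least-squares fit of y = a*x + b using the training rows only."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def rmse(model, rows):
    """Root-mean-square error of a model on the given rows."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in rows) / len(rows))

# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise
rng = random.Random(42)
data = [(0.5 * i, 2.0 * (0.5 * i) + 1.0 + rng.gauss(0.0, 0.5)) for i in range(40)]

train, test = split_train_test(data)

# Fit each strategy on the training rows, then score on both sets
results = {}
for name, fit in [("mean baseline", fit_mean), ("linear fit", fit_line)]:
    model = fit(train)
    results[name] = {"train": rmse(model, train), "test": rmse(model, test)}
    print(f"{name}: train RMSE={results[name]['train']:.2f}, "
          f"test RMSE={results[name]['test']:.2f}")
```

Listing the training and test metrics side by side makes overfitting visible: a model whose test error is much larger than its training error has likely memorized the training rows.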
Optimize Hyperparameters, Validation, Deployment Performance
The project progress report should also include a discussion and results related to hyperparameter optimization, validation, and deployment performance of the machine learning application. The progress report should be a nearly completed version of the final report.
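One common way to approach hyperparameter optimization with validation is a grid search scored by k-fold cross-validation, followed by a single final check on untouched test data. The sketch below illustrates this with a one-dimensional k-nearest-neighbors regressor written in plain Python. Everything specific here is an assumption for illustration: the synthetic sine data, the candidate values for the number of neighbors, the five folds, and the helper names `knn_predict`, `rmse`, and `cross_val_rmse`.

```python
import math
import random

def knn_predict(train, x, k):
    """Average the labels of the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse(train, rows, k):
    """RMSE on `rows` of a k-NN model that uses `train` as its memory."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in rows]
    return math.sqrt(sum(errors) / len(errors))

def cross_val_rmse(rows, k, n_folds=5):
    """Mean validation RMSE over n_folds interleaved folds."""
    scores = []
    for fold in range(n_folds):
        val = rows[fold::n_folds]
        fit = [row for i, row in enumerate(rows) if i % n_folds != fold]
        scores.append(rmse(fit, val, k))
    return sum(scores) / n_folds

# Synthetic data (an assumption for illustration): noisy sine curve
rng = random.Random(0)
data = [(0.1 * i, math.sin(0.1 * i) + rng.gauss(0.0, 0.2)) for i in range(100)]
rng.shuffle(data)
test, train = data[:20], data[20:]  # the test rows are never used for tuning

# Grid search: score each candidate with cross-validation on training rows
candidates = [1, 3, 5, 9, 15, 25]
cv_scores = {k: cross_val_rmse(train, k) for k in candidates}
best_k = min(cv_scores, key=cv_scores.get)

# Final check on held-out data, as a rough proxy for deployment performance
test_score = rmse(train, test, best_k)
print(f"best k = {best_k}, CV RMSE = {cv_scores[best_k]:.3f}, "
      f"test RMSE = {test_score:.3f}")
```

Keeping the test rows out of the search matters: if the hyperparameter were chosen by test error, the reported test metric would be optimistically biased.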
Final Project Report
The final project report should include the following elements:
- Cover letter introducing the context, significance, and contributions of the paper
- Highlights with 4-5 bullet points that summarize the main contributions
- Manuscript
  - Title, authors
  - Abstract
  - Introduction / Literature review
  - Theory / Methods
  - Training and test performance
  - Validation and deployment
  - Discussion
  - Conclusions
  - References
The Research & Writing Center (3340 HBLL) is a free resource where trained consultants provide assistance on assignments. Schedule an appointment to receive writing help at all stages of the research and writing process.
Final Project Presentations
Final project presentations are given during class time. Each presentation is followed by a Questions and Answers (Q+A) session in which the audience can ask questions.