Machine Learning Project

The machine learning project can be completed individually or in groups of up to 3 people. The project should have available data and should involve classification (supervised learning), clustering (unsupervised learning), regression, or dimensionality reduction.

Project Ideas

There are potential projects in Kaggle Datasets, UCI Datasets, Unearthed Solutions, Open Source Drilling Community, Datasets for Research, or through Data Science Competitions. Below is a list of data sets that are applicable to engineering.

The project includes a proposal, 3 progress reports, a final report, and a final presentation.

Project Proposal

  1. Identify a case study that is related to machine learning in engineering.
  2. Draw a diagram of the system with all features (input predictors) and labels (outputs) clearly indicated.
  3. Detail the parts of the project that are classification, regression, clustering, or dimensionality reduction.
  4. List articles (2-3) that give related results or locate authors that have worked in this area. Search online with Google Scholar, Science Direct, or other relevant search engines for scholarly work.
  5. List prior work or available data sets. Search online with Kaggle, Google, or other relevant repositories or search engines.
  6. Detail a plan for collecting and cleansing the data.
  7. List factors that may influence the success of the project. Where are the uncertainties and how will these uncertain factors be addressed?
  8. What is the timeline for the project and the anticipated final product for this project?

The purpose of the progress reports is to give intermediate check-points and review the current progress. The expectations for each progress report are discussed below.

Project Progress Report #1

Data Visualization, Exploration, and Cleansing

The first project progress report should describe how you have consolidated the data and prepared it for machine learning. The report should also give an update on the project timeline and discuss the uncertain factors that were identified in the project proposal. This progress report should also include a discussion of the relevant articles that were identified in the project proposal. The progress report should be a draft section of the final report that includes an introduction, literature review, and data summary with visualization.
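As an illustration of the data preparation step, the following is a minimal sketch that uses only the Python standard library. The column names (temperature, pressure, label) and the inline data are hypothetical placeholders, not part of the assignment; a real project would read its own data source and would likely use a data analysis library and plots as well.

```python
import csv
import io
import statistics

# Hypothetical raw data with one missing value and one duplicate row.
RAW_CSV = """temperature,pressure,label
20.5,101.2,0
21.0,,1
20.5,101.2,0
35.2,99.8,1
19.8,100.9,0
"""

def load_clean_rows(text):
    """Parse CSV text, dropping rows with missing values and exact duplicates."""
    seen = set()
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        values = tuple(record.values())
        if any(v is None or v.strip() == "" for v in values):
            continue  # skip incomplete rows
        if values in seen:
            continue  # skip exact duplicates
        seen.add(values)
        rows.append({
            "temperature": float(record["temperature"]),
            "pressure": float(record["pressure"]),
            "label": int(record["label"]),
        })
    return rows

def summarize(rows, column):
    """Return count, mean, min, and max for one numeric column."""
    data = [row[column] for row in rows]
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "min": min(data),
        "max": max(data),
    }

rows = load_clean_rows(RAW_CSV)
summary = summarize(rows, "temperature")
print(len(rows), summary)
```

Dropping incomplete rows is only one possible choice; imputing missing values may be more appropriate when data is scarce, and the report should state which approach was taken and why.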

Project Progress Report #2

Regression, Classification, Clustering, Dimensionality Reduction

The second project progress report should include a discussion and results for the machine learning methods applied: regression, classification, clustering, or dimensionality reduction. An important part of any machine learning project is to report results on test data that has not been used for training. The progress report should be a draft section of the final report that includes multiple machine learning strategies and performance metrics on the test data.
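The idea of reporting results on held-out test data can be sketched as follows. This is a minimal standard-library example on synthetic two-class data; the data generator, the 80/20 split, and the two classifiers (nearest centroid and 1-nearest-neighbor) are assumptions chosen for illustration, not required methods for the project.

```python
import math
import random

def make_data(n_per_class, rng):
    """Generate hypothetical 2-D points around two class centers."""
    centers = {0: (0.0, 0.0), 1: (5.0, 5.0)}
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data

def train_test_split(data, test_fraction, rng):
    """Shuffle a copy of the data and return (train, test) lists."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_centroids(train):
    """Compute the mean point of each class from the training data only."""
    sums = {}
    for (x, y), label in train:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

def predict_centroid(centroids, point):
    """Predict the class whose centroid is closest to the point."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))

def predict_1nn(train, point):
    """Predict the label of the single closest training point."""
    return min(train, key=lambda item: math.dist(point, item[0]))[1]

def accuracy(predict, test):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(1 for point, label in test if predict(point) == label)
    return correct / len(test)

rng = random.Random(0)
data = make_data(100, rng)
train, test = train_test_split(data, 0.2, rng)

centroids = fit_centroids(train)
results = {
    "nearest_centroid": accuracy(lambda p: predict_centroid(centroids, p), test),
    "1nn": accuracy(lambda p: predict_1nn(train, p), test),
}
print(results)
```

Both models are fit with the training portion only, and the test portion is touched once, to compute the reported metric. Accuracy is used here for simplicity; other metrics may suit a given project better.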

Project Progress Report #3

Optimize Hyperparameters, Validation, Deployment Performance

The third project progress report should include a discussion and results related to hyperparameter optimization, validation, and deployment performance of the machine learning application. The progress report should be a nearly completed version of the final report.
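One common way to tune a hyperparameter is a grid search scored by cross-validation on the training data, followed by a single evaluation on the test data. The sketch below assumes a k-nearest-neighbors classifier, a hypothetical candidate list for k, 4 folds, and synthetic data; none of these choices come from the assignment itself.

```python
import math
import random
from collections import Counter

def make_data(n_per_class, rng):
    """Generate hypothetical 2-D points around two overlapping class centers."""
    centers = {0: (0.0, 0.0), 1: (3.0, 3.0)}
    return [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in centers.items()
            for _ in range(n_per_class)]

def knn_predict(train, point, k):
    """Majority vote among the k training points closest to the query point."""
    neighbors = sorted(train, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def accuracy(train, evaluation, k):
    """Fraction of evaluation points classified correctly by k-NN."""
    hits = sum(1 for point, label in evaluation
               if knn_predict(train, point, k) == label)
    return hits / len(evaluation)

def cross_val_accuracy(data, k, n_folds):
    """Average accuracy over n_folds, holding out one fold at a time."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, held_out in enumerate(folds):
        fit_part = [item for j, fold in enumerate(folds) if j != i
                    for item in fold]
        scores.append(accuracy(fit_part, held_out, k))
    return sum(scores) / len(scores)

rng = random.Random(1)
data = make_data(60, rng)
rng.shuffle(data)
test, train = data[:24], data[24:]  # the test set is set aside before tuning

candidate_k = [1, 3, 5, 7]  # odd values avoid tied votes with two classes
cv_scores = {k: cross_val_accuracy(train, k, n_folds=4) for k in candidate_k}
best_k = max(cv_scores, key=cv_scores.get)

# Final check on data that played no role in choosing best_k.
test_score = accuracy(train, test, best_k)
print(cv_scores, best_k, test_score)
```

Keeping the test set out of the search matters: if the test data were used to pick k, the reported test score would no longer estimate performance on unseen data.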

Final Project Report

The final project report should include the following elements:

  1. Cover letter introducing the context, significance, and contributions of the paper
  2. Highlights with 4-5 bullet points that summarize the main contributions
  3. Manuscript
    1. Title, authors
    2. Abstract
    3. Introduction / Literature review
    4. Theory / Methods
    5. Training and test performance
    6. Validation and deployment
    7. Discussion
    8. Conclusions
    9. References

The Research & Writing Center (3340 HBLL) is a free resource where trained consultants provide assistance on assignments. Schedule an appointment to receive writing help at all stages of the research and writing process.

Final Project Presentations

Final project presentations are 5 minutes each and are pre-recorded. The recordings will be played during the final exam time (3 hrs), with a webinar link for remote participants. Each presentation is followed by 5 minutes of questions and answers (Q+A) with the audience.