You have been asked to create a project plan for the new machine learning model your company has asked you to build. List the main tasks and sub-tasks you would need to complete to create the model on AWS. How would you measure the accuracy of the model you created?
Project Plan for an ML model
Project life cycle
Machine learning projects are highly iterative. As you progress through the ML lifecycle, you will often iterate on one stage until it reaches a satisfactory level of performance, and only then move on to the next task. A project is not complete even after you ship the first version: feedback from real-world interactions keeps arriving, and the goals usually need to be redefined for the next iteration of deployment.
The machine learning development cycle comprises the following steps:
1. Planning and project setup
· Define the task requirements
· Determine project feasibility
· Discuss general model tradeoffs (accuracy vs speed)
· Set up project codebase
2. Data collection and labeling
· Create labeling documentation
· Build data ingestion pipeline
· Validate the quality of the data
· Revisit Step 1 and ensure data is sufficient for the task
3. Model exploration
· Establish baselines for model performance
· Start with a simple model using initial data pipeline
· Overfit a simple model to the training data
· Try many ideas in parallel during the early stages
· Find a state-of-the-art (SoTA) model for your problem domain and reproduce its results, then apply it to your dataset as a second baseline
· Revisit Step 1 and ensure feasibility
· Revisit Step 2 and ensure data quality is sufficient
4. Model refinement
· Perform model-specific optimizations
· Iteratively debug model as complexity is added
· Perform error analysis to uncover common failure modes
· Revisit Step 2 for targeted data collection of observed failures
5. Testing and evaluation
· Evaluate model on test distribution
· Revisit model evaluation metric
· Write tests for:
    · Input data pipeline
    · Model inference functionality
    · Model inference performance on validation data
    · Explicit scenarios expected in production
6. Model deployment
· Expose model via a REST API
· Deploy new model to small subset of users to ensure everything goes smoothly
· Maintain the ability to roll back model to previous versions
· Monitor live data and model prediction distributions
7. Ongoing model maintenance
· Understand that changes can affect the system in unexpected ways
· Periodically retrain model to prevent model staleness
· If there is a transfer in model ownership, educate the new team
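Step 5 calls for evaluating the model on held-out test data, and the question asks how accuracy would be measured. The following is a minimal sketch of that measurement, assuming a binary classification task; the function names (`confusion_counts`, `classification_metrics`) and the sample labels are hypothetical and used only for illustration. It reports precision, recall, and F1 alongside accuracy, because accuracy alone can be misleading when the classes are imbalanced.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for held-out predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    c = confusion_counts(y_true, y_pred, positive)
    tp, fp, tn, fn = c["tp"], c["fp"], c["tn"], c["fn"]

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test-set labels and model predictions (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The metrics should be computed on data the model never saw during training or tuning. For a regression task, an error measure such as mean absolute error would take the place of these classification metrics.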
Team roles
A typical team is composed of:
· data engineer (builds the data ingestion pipelines)
· machine learning engineer (trains and iterates on models to perform the task)
· software engineer (aids with integrating machine learning model with the rest of the product)
· project manager (main point of contact with the client)
As the question asks, planning and project setup is discussed briefly below:
1. Planning and project setup
Defining the model task is not always straightforward. There are often many different approaches you can take towards solving a problem. The following explanation outlines the tasks and sub-tasks that a project plan should include.
Please note: This outline of tasks is not intended to prescribe a specific implementation approach or methodology. It is a generalized approach.
List of Deliverables
Overview of Tasks:
A project plan can be divided into these major implementation tasks. A summary of each task is provided below.
1. Task 1 - Project Initiation and Planning
2. Task 2 - System, Interface, and Data Conversion Design
3. Task 3 - System Development / Configuration
4. Task 4 - System Testing
5. Task 5 - Project Training
6. Task 6 - Deployment
7. Task 7 - Implementation Closeout
Sub-tasks and Activities by Task:
The preliminary sub-tasks and activities associated with each task are as follows: