Intro:
Documentation is often overlooked during software development, especially when the team follows an agile methodology, which values working software over comprehensive documentation. For a machine learning project, however, documentation that supports the development process is key.
Documentation Assets Inventory:
| Phase | Documentation |
|---|---|
| Inspiration / Problem Identification | Business Objectives, Business Success Criteria (define experiments to validate the hypothesis) |
| | Requirements, Assumptions, and Constraints |
| | Risks and Contingencies, Terminology |
| | Costs and Benefits |
| | Data Mining Goals and Data Mining Success Criteria |
| | Project Plan, Initial Assessment of Tools and Techniques |
| EDA and Data Engineering | Data Exploration Report and Data Description Report |
| | Data Quality Assessment Report |
| | Data strategy (rationale for inclusion / exclusion) for the model |
| | Data engineering design (cleansing and transformation rules) |
| Feature Engineering & ML Models | Algorithm cheat sheet: algorithms for different use cases (document the selection and the reasons for it) |
| | Link farm to research papers and relevant external resources |
| | Test design and evaluation criteria |
| | Model evaluation and approval for validating the hypothesis |
| | Assessment of Data Mining Results w.r.t. Business Success |
| | Actual Code |
| Operationalise | Continuous Monitoring and Maintenance Plans |
| | Troubleshooting guide for performance and testing techniques |
| | MLOps design |
| | Operations Guide and Configuration scripts for API |
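
To make one of these assets concrete, here is a minimal sketch of how a Data Quality Assessment Report could be generated from raw records. It is an illustration under stated assumptions, not a prescribed format: the function names `data_quality_report` and `to_markdown`, the choice of checks (missing values, distinct values, duplicate rows), and the sample records are all hypothetical.

```python
from collections import Counter


def data_quality_report(rows, columns):
    """Summarize missing values and distinct counts per column, plus duplicate rows."""
    total = len(rows)
    report = {"row_count": total, "columns": {}}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing (an assumption for this sketch).
        missing = sum(1 for v in values if v is None or v == "")
        present = [v for v in values if v is not None and v != ""]
        report["columns"][col] = {
            "missing": missing,
            "missing_pct": round(100 * missing / total, 1) if total else 0.0,
            "distinct": len(set(present)),
        }
    # Count rows that repeat an earlier row exactly.
    counts = Counter(tuple(row.get(c) for c in columns) for row in rows)
    report["duplicate_rows"] = sum(n - 1 for n in counts.values())
    return report


def to_markdown(report):
    """Render the report as a small Markdown table for the project docs."""
    lines = [
        f"Rows: {report['row_count']}, duplicate rows: {report['duplicate_rows']}",
        "",
        "Column | Missing | Missing % | Distinct",
        "---|---|---|---",
    ]
    for name, stats in report["columns"].items():
        lines.append(
            f"{name} | {stats['missing']} | {stats['missing_pct']} | {stats['distinct']}"
        )
    return "\n".join(lines)


# Hypothetical sample records, invented for illustration only.
sample = [
    {"age": 34, "city": "Pune"},
    {"age": None, "city": "Pune"},
    {"age": 34, "city": "Pune"},
    {"age": 51, "city": ""},
]
report = data_quality_report(sample, ["age", "city"])
print(to_markdown(report))
```

Generating the report from code rather than writing it by hand means it can be regenerated whenever the data changes, so the document stays in step with the dataset it describes.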
The Building Blocks of a Successful Machine Learning Project: Deliverables and Documentation
- Documentation makes the work repeatable and reproducible. Machine learning models are built from data, so others must be able to replicate both the data and the methods used to build the model.
- Documentation facilitates collaboration within the team working on the machine learning problem. It keeps everyone on the same page regarding the scope, goals, and methods of the project.
- Documentation makes the model easier to maintain over time. When the data sources, preprocessing steps, modeling techniques, and evaluation metrics are written down, changes and improvements to the model are easier to make.
- Documentation keeps the model transparent and understandable. Stakeholders and users can see how the model works, what data it is based on, and how accurate it is.
- Documentation helps to ensure that the project complies with relevant regulations and standards.
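
One way to capture the data sources, preprocessing steps, modeling technique, and evaluation metrics mentioned above is a small structured record stored next to the code. The sketch below is only an illustration: the `ModelRecord` class, its field names, and every value in the example are hypothetical placeholders, not results from a real project.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ModelRecord:
    """Hypothetical documentation record; the field names are illustrative."""
    name: str
    data_sources: list
    preprocessing_steps: list
    modeling_technique: str
    evaluation_metrics: dict = field(default_factory=dict)
    random_seed: int = 0  # recorded so that a rerun can reproduce the result

    def to_json(self):
        # sort_keys keeps the output stable, so diffs in version control stay readable.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


record = ModelRecord(
    name="churn-classifier",  # made-up example name
    data_sources=["crm_export.csv"],  # made-up file name
    preprocessing_steps=["drop rows with missing target", "standardize numeric columns"],
    modeling_technique="logistic regression",
    evaluation_metrics={"accuracy": 0.87},  # placeholder value, not a real result
    random_seed=42,
)
print(record.to_json())
```

Because the record is plain JSON, it can be committed alongside the model code and reviewed like any other change.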
Overall, documentation is an essential part of any machine learning project. It helps to ensure that the project is well planned and well executed, and that it can be maintained and scaled in the future.