I discuss the recent acquisition of Confetti AI, an education company I bootstrapped, and the lessons I learned going through this process.
I discuss the messy state of MLOps today and how we are still in the early phases of a broader transformation to bring machine learning value to enterprises globally.
I describe the data science life cycle, a methodology for effectively developing and deploying data-driven projects.
In this fifth post in a series on how to build a complete machine learning product from scratch, I describe how to deploy our model and set up a continuous integration system.
In this fourth post in a series on how to build a complete machine learning product from scratch, I describe how to perform error analysis on our first model and work toward building a V2 model.
In this third post in a series on how to build a complete machine learning product from scratch, I describe how to build an initial model with an associated training/evaluation pipeline and functionality tests.
In this second post in a series on how to build a complete machine learning product from scratch, I describe how to acquire your dataset and perform initial exploratory data analysis.
In this first post in a series on how to build a complete machine learning product from scratch, I describe how to set up your project and tooling.
After analyzing 1000+ Y Combinator companies, I discover there's a huge market need for more engineering-focused data practitioner roles.
I provide an aggregated and comprehensive list of tutorials I have made for fundamental concepts in machine learning.
I discuss market basket analysis, an unsupervised learning technique for understanding and quantifying the relationships between sets of items.
I discuss decision trees, which are very powerful, general-purpose models that are also interpretable.
I discuss the k-nearest neighbors algorithm, a remarkably simple but effective machine learning model.
In which I describe commonly used techniques for evaluating machine learning models.
In which I discuss the technique of ensembling, which is used to improve the performance of a single machine learning model by combining the power of several other models.
I discuss strategies such as cross-validation, which are used for selecting the best-performing machine learning models.
In which I give a primer on principal components analysis, a commonly used technique for dimensionality reduction in machine learning.
In which I discuss regularization, a strategy for controlling a model's generalizability to new datasets.
In which I describe the bias-variance tradeoff, one of the most important concepts underlying all of machine learning theory.
In which I discuss feature selection, which is used in machine learning to help improve model generalization, reduce feature dimensionality, and do other useful things.
In which we investigate K-means clustering, a common unsupervised clustering technique for analyzing data.
We discuss support vector machines, a very powerful and versatile machine learning model.
I describe Naive Bayes, a commonly used generative model for a variety of classification tasks.
I describe logistic regression, one of the cornerstone algorithms of the modern-day machine learning toolkit.
I describe the basics of linear regression, one of the most common and widely used machine learning techniques.
A discussion of the most important skills necessary for being an effective machine learning engineer or data scientist.