In the latest episode of Collaborative Craft, I spoke with principal crafters Myles Megyesi and Pierce Edmiston about their experience applying machine learning (ML) to a real production system.
Throughout our chat, we tried to demystify ML and clear up common misconceptions by looking at how it is implemented in real-world, production-level applications. In each case, we saw the value of having data scientists and engineers work together on machine learning projects, because all of the usual rules and best practices for writing quality code still apply.
With machine learning becoming a buzzword that’s seemingly all over the map, I wanted to know what problems ML is actually meant to solve.
"There's a lot that can be solved with basic statistics in the linear model, fitting a line to a series of points, and getting the slope of that line, and, you know, being able to predict points on that line.
"You can get pretty far with that. But I think machine learning, the way it's talked about today, usually involves more complicated relationships than you can fit with a line. And that tends to come when you have large data sets. Although we want all of our data sets to be sort of normally distributed, we often run into real use cases with very long tails."
- Pierce Edmiston, Principal Crafter
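To make the "fitting a line" idea concrete, here is a minimal sketch in plain Python of ordinary least squares on a handful of points. The data, the function names (`fit_line`, `predict`, `mean_abs_error`), and the choice of an exponential curve as the "more complicated relationship" are all illustrative assumptions, not anything discussed in the episode.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Predict a point on the fitted line."""
    return slope * x + intercept


def mean_abs_error(xs, ys, slope, intercept) -> float:
    """Average absolute gap between the data and the fitted line."""
    return sum(abs(predict(slope, intercept, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Illustrative data (assumed): one truly linear series, one curved series.
xs = [0, 1, 2, 3, 4]
linear_ys = [2 * x + 1 for x in xs]   # a line fits this exactly
curved_ys = [2 ** x for x in xs]      # growth a straight line cannot follow

lin_slope, lin_intercept = fit_line(xs, linear_ys)
cur_slope, cur_intercept = fit_line(xs, curved_ys)

linear_error = mean_abs_error(xs, linear_ys, lin_slope, lin_intercept)
curved_error = mean_abs_error(xs, curved_ys, cur_slope, cur_intercept)
```

On the linear series the fit recovers the slope and intercept and the error is essentially zero. On the curved series the same method leaves a noticeable gap at every point, which is the kind of situation where a more flexible model starts to earn its keep.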
I also wanted to know where we should draw the line on machine learning’s potential.
"I would say it would be useful to distinguish in machine learning the idea of doing some sort of predictive process or automation process as being separate from something that is geared toward generating insight. Because I don't really think about machine learning as being something that generates insight necessarily."
- Pierce Edmiston, Principal Crafter
Pierce and Myles also detailed their experience developing a system where they weren’t sure whether ML would be the right tool for their problem.
"At the beginning of this project, we didn't know whether this ML thing was gonna work. We didn't really want to spend a year in R&D building something out and then have nothing. We wanted to be able to have something at the end of this that worked regardless of whether the ML was successful or not. That was where we took the approach to build out the workflow product first, and then the ML is just gonna be this slight optimization on top.
"So at the very worst, the client would still have … a workflow to get the information off of these documents in a repeatable, consistent way that could scale to thousands of documents and still be successful. Thankfully, it did work out."
- Myles Megyesi, Principal Crafter
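One way to picture "workflow first, ML as an optimization on top" is a review step that always works by hand and treats a model's suggestion as an optional prefill. The sketch below is purely hypothetical: the episode does not describe the actual implementation, and every name here (`extract_field`, `FieldResult`, the demo suggester and reviewer) is invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical types: a suggester guesses a value from document text (or gives up),
# and a reviewer is the human step that accepts the guess or enters the value.
Suggester = Callable[[str], Optional[str]]
Reviewer = Callable[[str, Optional[str]], str]


@dataclass
class FieldResult:
    value: str
    source: str  # "suggested" if the reviewer kept the model's guess, else "manual"


def extract_field(text: str, review: Reviewer, suggest: Optional[Suggester] = None) -> FieldResult:
    """Run the manual workflow, using an ML suggestion only when one is available."""
    suggestion = None
    if suggest is not None:
        try:
            suggestion = suggest(text)
        except Exception:
            # The workflow must keep working even if the model fails.
            suggestion = None
    value = review(text, suggestion)
    source = "suggested" if suggestion is not None and value == suggestion else "manual"
    return FieldResult(value=value, source=source)


# Stand-ins for a real model and a real human reviewer (assumed for the demo).
def demo_suggester(text: str) -> Optional[str]:
    marker = "total: "
    if marker in text:
        return text.split(marker, 1)[1].strip()
    return None


def demo_reviewer(text: str, suggestion: Optional[str]) -> str:
    # Accept the prefill if there is one; otherwise "type" the value by hand.
    return suggestion if suggestion is not None else "42.00"


def broken_suggester(text: str) -> Optional[str]:
    raise RuntimeError("model unavailable")


doc = "Invoice total: 42.00"
with_ml = extract_field(doc, demo_reviewer, demo_suggester)
without_ml = extract_field(doc, demo_reviewer)
ml_down = extract_field(doc, demo_reviewer, broken_suggester)
```

In all three cases the field still gets extracted; the model only changes how much typing the reviewer has to do. That mirrors the point of the quote: if the ML had never worked, the workflow would still have delivered value.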
The duo also shared their experiences working with machine learning, and what they’d say to developers who are just getting started with ML.
"There are nuances and there's expertise that can help you along the way, but the practices don't change. The principles that we hold dear do not change. You know, the same rigor that we apply to developing websites, to developing APIs, to developing anything applies equally to ML and data engineering."
- Myles Megyesi, Principal Crafter
"Given the maturity of some of the machine learning tools that are out there being open source, I do think that software engineers should become more comfortable approaching these machine learning frameworks in the same way that they would approach learning a new web framework or a new programming language.
"There are vibrant communities out there, there's good documentation, and you can learn it just like you learn anything else."
- Pierce Edmiston, Principal Crafter
SUBSCRIBE TO COLLABORATIVE CRAFT
If you'd like to receive new episodes as they're published, please subscribe to Collaborative Craft in Apple Podcasts, Google Podcasts, Spotify, or wherever you get your podcasts. If you enjoyed this episode, please consider leaving a review in Apple Podcasts. It really helps others find the show.
Learn more about Collaborative Craft and discover other episodes from our archive.