
"Good enough" is good enough

You bought a watch that tells the correct time 90% of the time you look at it.  The other 10% of the time it might give you an obviously incorrect time (say, showing 11PM during the day) or might be just far enough off to make you miss that important meeting.

After using this watch for a while you notice that its correctness depends on several factors such as the time of day, your longitude (but not your latitude), the angle at which you view the watch face, what you had for breakfast the previous day, and so on.  When you need to find out if you can catch the next bus, for example, you learn to first move to a location with the optimal ambient lighting, tilt your head just so, and raise your wrist at just the right speed to ensure a correct reading.

You go back to the store and exchange the watch for a better one that is correct 95% of the time.  However, it comes with a different set of quirks for getting the correct time.  You need to remember to rub your right cheek before reading the time, otherwise it could be anywhere from 5 minutes to 2 hours off.

===

Would you pay 90% of the price of an always-correct watch for this unreliable one?  The story is a parody, but the unreliability, and the inability to manage it (controlling when and by how much the errors occur), are very real when we deal with machine learning models trained through statistical methods.

Because telling time is a problem with better solutions, you wouldn't rely on a watch like the one in the story.  But there are problems that do not seem to have better solutions (such as predicting the weather), and we are applying statistical models to more and more problems.

Even among these problems, the important thing is to recognize that statistical methods alone (and contemporary machine learning in particular) may or may not be the right approach.  There are problems where a "good enough" solution is indeed good enough.  If I am building a machine learning model for predicting the number of fallen leaves in my yard, it is probably okay if the result is only somewhat close to the correct number, with a wide margin of error.  But if I am building a machine learning model to help doctors diagnose medical conditions, I might not be willing to accept "good enough" as the quality bar.
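
One way to make "the quality bar" concrete is to write it down as two numbers: how far off a prediction may be, and how often it must land within that range. The sketch below is only an illustration under my own assumptions: the function names, the error values, and both thresholds are made up for the example and do not come from any real model or standard.

```python
def fraction_within(errors, tolerance):
    """Share of predictions whose absolute error is at most `tolerance`."""
    return sum(abs(e) <= tolerance for e in errors) / len(errors)


def meets_bar(errors, tolerance, required_fraction):
    """True if enough predictions fall within the tolerated error."""
    return fraction_within(errors, tolerance) >= required_fraction


# Hypothetical prediction errors on ten test cases (made-up numbers).
errors = [0, 1, -2, 0, 3, -1, 0, 40, 1, -1]

# A forgiving bar, in the spirit of counting fallen leaves:
# being off by up to 50 is fine, and being right 90% of the time is plenty.
leaf_ok = meets_bar(errors, tolerance=50, required_fraction=0.9)

# A strict bar, in the spirit of a high-stakes use:
# the error must be at most 2 in 99% of cases.
strict_ok = meets_bar(errors, tolerance=2, required_fraction=0.99)

print(leaf_ok, strict_ok)
```

The same set of errors passes the first bar and fails the second. Nothing about the model changed; only what "good enough" means for the problem did.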


