Confusing terminology

I work at the intersection of image processing, medicine and machine learning. During my earlier career as a management consultant, working across many different industries, I came to learn the importance of a common terminology. A terminology (fancier words are ontology and conceptual model) describes the concepts that the words used in a particular context denote and the relationships between those concepts. As an example, if we don’t agree on the meaning of the words quality and product in our organization, confusion and bad decisions will most likely follow.

Sometimes it is not so easy to define a terminology, especially if we try to establish the terminology ex nihilo. See an earlier post for an example.

A couple of terms from the deep learning community that have bugged me all along are tensor and validation set.

I was for some time looking for the “tensorness” of machine learning tensors, that is, some notion of particular transformation rules or other tensor semantics. I suspected that there was something sophisticated about these data structures that I had missed. Eventually I came to realize that they are just multidimensional arrays with semantics attached by their particular use (weight data, input image data, activation data, mini-batch data,…). Tensors are in fact very similar to ndarrays in numpy, so much so that when going back and forth between PyTorch tensors and ndarrays, they end up sharing the same underlying memory.
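A minimal sketch of that memory sharing, assuming a recent PyTorch and numpy are installed; converting in either direction on the CPU does not copy the data:

```python
import numpy as np
import torch

# A numpy ndarray and the PyTorch tensor created from it share memory.
arr = np.zeros(3)
t = torch.from_numpy(arr)   # no copy is made

arr[0] = 42.0               # modify the ndarray in place...
print(t)                    # ...and the "tensor" sees the change

# The reverse direction behaves the same way for CPU tensors.
back = t.numpy()
t[1] = 7.0
print(back)                 # [42.  7.  0.]
```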

Calling arrays tensors is a bit like calling bias values temperature. Both bias and temperature are represented by scalar numbers, but they have entirely different conventional semantics, and saying that “the temperature of this neuron is 5” would be rather confusing until everybody understood that temperature is a weight on an input that is held constant at one. Also see this post about the differences (and similarities) between matrices and tensors.
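A small sketch of that “weight on a constant input of one” view, using plain numpy to keep it framework-neutral (the numbers are arbitrary):

```python
import numpy as np

w = np.array([0.5, -1.0])    # ordinary weights
b = 5.0                      # the bias, i.e. the scalar the analogy would call "temperature"
x = np.array([2.0, 3.0])     # input

y1 = w @ x + b               # explicit bias term

# Equivalent: append a constant 1 to the input and fold the bias into the weights.
w_aug = np.append(w, b)
x_aug = np.append(x, 1.0)
y2 = w_aug @ x_aug

assert np.isclose(y1, y2)    # both formulations give the same output
```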

Validation in software engineering, as well as in the medical device industry, refers to testing a product in a real (in the case of medical devices, clinical) environment. The purpose of validation is to find out whether the product works in the real world, that is, whether it satisfies the right requirements (the purpose of verification, a related term, is to check whether the product satisfies the requirements we have actually written down, be they correct or not). So the word validation in validation set in the deep learning literature is not even close to what validation means in established industries. The alternative term development set is not as bad, but it doesn’t really say anything about the specific use of that data set. In our organization we have adopted the term tuning set instead, as we believe this name best reflects what the set is actually used for: tuning the model. The test set in deep learning is what we would call a verification (data) set.
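To make the renaming concrete, here is a hypothetical sketch of splitting a data set using the terms above; the sizes, seed and variable names are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
indices = rng.permutation(1000)          # indices of 1000 examples, shuffled

train_idx        = indices[:700]         # used to fit the model weights
tuning_idx       = indices[700:850]      # "validation set" in DL parlance: tune hyperparameters, early stopping
verification_idx = indices[850:]         # "test set" in DL parlance: held out to verify the final model
```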

I’m not sure why the deep learning community has adopted these terms, which are confusing for the rest of us. At a minimum we need proper translations to the terms that established industries have used for a long time.
