You might find this interesting as well:
https://www.spinroot.com/pico
2000-2003; both are prehistoric. We have neural networks now to do things like upscaling and colorization.
Last time I was doing image processing in C, I was quantizing the color space using a technique from a 1982 paper. Just because a source is old doesn't mean it is wrong.
It doesn't mean it's wrong. Partial derivatives are centuries old, after all. But we are moving on. Just recently I saw a book on image recognition. Although recent, it was focused on the math, with not a word about neural networks. It looked so outdated. Someone spent a whole career going down that path. Another example is the linguistic approach to text translation: a 'stupid' LLM does it far better today. I can only guess how professional linguists feel about that.
Yes, those methods are old, but they’re explainable and much easier to debug or improve compared to the black-box nature of neural networks. They’re still useful in many cases.
Only partially. The chapters on edge detection, for example, only have historic value at this point. A tiny NN can learn edges much better (which was the claim to fame of AlexNet, basically).
That absolutely depends on the application. "Classic" (i.e. non-NN) methods are still very strong in industrial machine vision applications, mostly due to their momentum, explainability/trust, and performance/cost. Why use an expensive NPU if you can do the same thing in 0.1 ms on an embedded ARM?
An NN that has been trained by someone else, on unknown data, with unknown objectives, and containing unknown defects and backdoors can compute something fast, but why should it be trusted to do my image processing? Even if the NN is built in-house, overcoming the trust issue, principled algorithms have general correctness proofs, while NNs have, at best, promising statistics on validation datasets.
This doesn’t match my experience. I spent a good portion of my life debugging SIFT, ORB etc. The mathematical principles don’t matter that much when you apply them; what matters is performance of your system on a test set.
Turns out a small three-layer convnet autoencoder did the job much better with much less compute.
You cannot prove that an algorithm does what you want, unless your understanding of what you want is quite formal. But you can prove that an algorithm makes sense and that it doesn't make specific classes of mistake: for example, a median filter has the property that all output pixel values are the value of some input pixel, ensuring that no out of range values are introduced.
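As a minimal illustration of that property (not code from the book; an 8-bit grayscale, row-major buffer is assumed), a 3x3 median filter can only ever output one of the nine input values under its window:

    #include <stdlib.h>

    static int cmp_u8(const void *a, const void *b)
    {
        return (int)*(const unsigned char *)a - (int)*(const unsigned char *)b;
    }

    /* 3x3 median filter: each output pixel is always one of the nine
       input pixels under the window, so no out-of-range value can
       appear. Border pixels are left untouched for brevity. */
    void median3x3(const unsigned char *in, unsigned char *out, int w, int h)
    {
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                unsigned char win[9];
                int k = 0;
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                        win[k++] = in[(y + dy) * w + (x + dx)];
                qsort(win, 9, 1, cmp_u8);
                out[y * w + x] = win[4];   /* the median element */
            }
        }
    }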
Few customers care about proofs. If you can measure how well the method works for the desired task, that is in most cases sufficient, and in many cases preferred over proofs.
For hobbyists that's enough; for engineers it's often okay (I find myself in that situation); but for scientists, "good enough" means nothing.
Optical metrology relies on accurate equations for how a physical object maps to the image plane, so in that case analytical solutions are necessary for subpixel accuracy.
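For readers who haven't seen that mapping written down, the underlying model is the pinhole projection. The sketch below (focal length f in pixels and principal point (cx, cy) are illustrative names, not from the thread) shows only the ideal case; real metrology adds lens-distortion terms and calibrates them to reach subpixel accuracy:

    typedef struct { double x, y, z; } Point3;   /* point in camera coordinates */
    typedef struct { double u, v; } Pixel;       /* image-plane coordinates */

    /* Ideal pinhole projection: u = f*x/z + cx, v = f*y/z + cy. */
    Pixel project_pinhole(Point3 p, double f, double cx, double cy)
    {
        Pixel q;
        q.u = f * p.x / p.z + cx;
        q.v = f * p.y / p.z + cy;
        return q;
    }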
I'm worried about how often kids these days discount precise mathematical models for all use cases. Sure, you get there most of the time but ignore foundational math and physics at your own peril.
Classical CV algorithms are always preferred over NNs in every safety-critical application.
Except self-driving cars, and we all see how that's going.
Self-driving cars are a political problem, not a technical one. Our roads don't work especially well for human drivers, so I don't know why anyone expected machines to achieve perfection.
It is much harder to accept anything less than perfection from a machine when human life is in the equation.
I can forgive a human.
The claim that "a tiny NN can learn edges better" is misleading. Classical algorithms like Canny or Sobel are specifically designed for edge detection, making them faster, more reliable, and easier to use in controlled environments. Neural networks need more data, training, and computational power, often just to achieve similar results. For simple edge detection tasks, classical methods are typically more practical and efficient.
For clean, classical edges in standard images, classical algorithms (Canny, Sobel) are hard to beat in terms of accuracy, efficiency, and clarity.
For domain-specific edges (e.g., medical images, low-light, or noisy industrial setups), a small, well-trained neural network may perform better by learning more complex patterns.
In summary, "Garbage In, Garbage Out" applies to both classical algorithms and neural networks. Good camera setup, lighting, and optics solve 90% of machine vision or computer vision problems before the software—whether classical or neural network-based—comes into play.
"The chapters on edge detection, for example, only have historic value at this point"
Are there simpler, faster and better edge detection algorithms that are not using neural nets?
If you're looking for simpler, faster, and reliable edge detection algorithms, traditional methods like Canny, Sobel, Prewitt, and Laplacian are excellent choices (a minimal Sobel sketch follows below).
For real-time applications or resource-constrained systems, simpler methods like Roberts or Prewitt may be the best.
However, if you need more robustness against noise or better edge accuracy, Canny or Scharr are preferred.
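To make the comparison concrete, here is a minimal Sobel gradient-magnitude sketch (8-bit grayscale, row-major buffer assumed; not taken from the book). Canny builds on gradients like these by adding smoothing, non-maximum suppression, and hysteresis thresholding:

    #include <math.h>

    void sobel(const unsigned char *in, unsigned char *out, int w, int h)
    {
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                /* Horizontal and vertical Sobel kernels */
                int gx = -in[(y-1)*w + (x-1)] + in[(y-1)*w + (x+1)]
                         - 2*in[y*w + (x-1)]  + 2*in[y*w + (x+1)]
                         - in[(y+1)*w + (x-1)] + in[(y+1)*w + (x+1)];
                int gy = -in[(y-1)*w + (x-1)] - 2*in[(y-1)*w + x] - in[(y-1)*w + (x+1)]
                         + in[(y+1)*w + (x-1)] + 2*in[(y+1)*w + x] + in[(y+1)*w + (x+1)];
                int mag = (int)sqrt((double)(gx*gx + gy*gy));
                out[y*w + x] = (unsigned char)(mag > 255 ? 255 : mag);
            }
        }
    }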
I wonder if doing classical processing of real-time data as a pre-phase, before you feed it into an NN, could be beneficial?
Yes, it’s part of the process of data augmentation, which is commonly used to avoid classifying on irrelevant aspects of the image like overall brightness or relative orientation.
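One concrete example of such a classical pre-step (a hedged sketch, not from the thread; the float output buffer and its use as network input are assumptions) is removing overall brightness by normalizing a grayscale frame to zero mean and unit variance before it reaches the network:

    #include <math.h>

    /* Rescale an 8-bit frame so its mean is 0 and variance 1; overall
       brightness then carries no signal the network could latch onto. */
    void normalize_brightness(const unsigned char *in, float *feat, int n)
    {
        double sum = 0.0, sumsq = 0.0;
        for (int i = 0; i < n; i++) {
            sum += in[i];
            sumsq += (double)in[i] * in[i];
        }
        double mean = sum / n;
        double var = sumsq / n - mean * mean;
        double sd = sqrt(var > 1e-12 ? var : 1e-12);
        for (int i = 0; i < n; i++)
            feat[i] = (float)((in[i] - mean) / sd);
    }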
I see it the same way I see 'Applied Cryptography'. It’s old C code, but it helps you understand how things work under the hood far better than a modern black box ever could. And in the end, you become better at cryptography than you would by only reading modern, abstracted code.
310 pages of text and 500 pages of C code in the appendix; this could use a supplemental GitHub page.
The source code is at https://github.com/Dwayne-Phillips/CIPS
Nice reference! The URL in the preface is dead.