MrBonsoir à l'internet: image processing


15/04/2015

Clash of the titans - optimization vs. machine learning

In the beginning

What is important to know about machine/deep learning problems? My first note to myself is "what are we trying to solve in general?", and then "which method or technique do we choose?" or "which approach is most appropriate for a given problem?".

Optimization for the people

Optimization is widely used to solve complex problems that have no analytic (closed-form) solution. That doesn't mean problems that do have an analytic solution can't also be solved using optimization.

Optimization relies on a model you provide, one that describes a phenomenon with reasonable accuracy (e.g. finding the colorant combination of cyan, magenta, yellow and more for a given red, green, blue pixel), or anything else you want. You will hear about derivatives, gradients, local minima, cost functions, quadratic forms, linearity, non-linearity, iterations and more when you start messing around with optimization.

And it's entirely possible to use optimization techniques as applied-mathematics tools without knowing exactly how they work (e.g. you provide your model and the tools compute the derivatives for you). In the engineering world you connect boxes, each one solving a simple task and taking the output of the previous one as its starting point.
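To make the "provide a model, let the tool handle the derivatives" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the forward model `toy_printer_model` is a made-up ink-to-RGB formula (not a real printer model), the derivatives are estimated by finite differences, and the solver is a basic fixed-step gradient descent.

```python
def toy_printer_model(cmy):
    """Hypothetical forward model: ink amounts in [0, 1] -> RGB in [0, 1]."""
    c, m, y = cmy
    r = (1 - c) * (1 - 0.10 * m) * (1 - 0.05 * y)
    g = (1 - 0.10 * c) * (1 - m) * (1 - 0.10 * y)
    b = (1 - 0.05 * c) * (1 - 0.10 * m) * (1 - y)
    return (r, g, b)


def cost(cmy, target_rgb):
    """Squared distance between the predicted and the wanted colour."""
    predicted = toy_printer_model(cmy)
    return sum((p - t) ** 2 for p, t in zip(predicted, target_rgb))


def numerical_gradient(f, x, h=1e-6):
    """Central finite differences: the 'box' that computes derivatives for us."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def solve_inks(target_rgb, start=(0.5, 0.5, 0.5), step=0.3, iterations=500):
    """Fixed-step gradient descent, keeping ink amounts inside [0, 1]."""
    x = list(start)
    for _ in range(iterations):
        grad = numerical_gradient(lambda cmy: cost(cmy, target_rgb), x)
        x = [min(1.0, max(0.0, xi - step * gi)) for xi, gi in zip(x, grad)]
    return x


# Build a target from known ink amounts, then try to recover them.
true_inks = (0.2, 0.6, 0.4)
target = toy_printer_model(true_inks)
found = solve_inks(target)
```

The step size and iteration count were picked by hand for this toy model; a real model would normally go through a library optimizer instead.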

Deep learning for the people

Deep learning and neural networks let you approach your problem in a cleverer way. Suppose your problem has been defined and described, but optimization techniques did not give the expected results: they are not fast enough, or they simply do not work. One possibility is that your model isn't good enough, or is way too complex.

The simple idea is to let a system learn about an ecosystem. To do so, we let the algorithms mimic how our brain works. The concept of learning is very important here because it is really what we want to achieve. We want our algorithm to learn in a first step, obtaining representative parameters/weights, before giving us a result. Then, once the learning is finished, the algorithm can use those parameters to answer for a given input. For example: is this an image of a car or of an elephant, and with what degree of confidence?
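The two phases (learn the weights first, then answer with a confidence) can be sketched with a single artificial neuron, which is far simpler than a deep network. This is only an illustration: the training sample is invented, the two features stand in for whatever would be extracted from an image, and the learning rate and number of epochs are arbitrary choices.

```python
import math

# Hypothetical training sample: (feature_1, feature_2) -> label (0 or 1).
TRAINING = [
    ((0.1, 0.2), 0), ((0.3, 0.1), 0), ((0.2, 0.4), 0), ((0.4, 0.3), 0),
    ((0.9, 0.8), 1), ((0.7, 0.9), 1), ((0.8, 0.6), 1), ((0.6, 0.9), 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Confidence (between 0 and 1) that the input belongs to class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(samples, learning_rate=0.5, epochs=2000):
    """Learning phase: batch gradient descent on the log-loss."""
    weights, bias = [0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for features, label in samples:
            error = predict(weights, bias, features) - label
            grad_w = [g + error * x for g, x in zip(grad_w, features)]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Phase 1: learn the parameters. Phase 2: ask for answers on new inputs.
weights, bias = train(TRAINING)
confidence_high = predict(weights, bias, (0.95, 0.9))
confidence_low = predict(weights, bias, (0.05, 0.1))
```

Once `train` has run, only `weights` and `bias` are needed to answer new questions, which is the point of separating learning from use.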

A big part of the learning is preparing the training sample. You can't just hand images to the algorithm. Applied to computer vision, deep learning methods try to extract features from images in a way similar to how we humans recognize information in images. This feature-extraction step applies multiple filters to the images, and then to the resulting filtered images, using convolution and tiling approaches. At the end you obtain classes of features, very similar to the filters used for face recognition. The only difference is that the features describing a human face are now almost standard and don't need to be computed or extracted again.
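As a rough picture of what "filtering the images and then the filtered images" means, here is a small sketch in plain Python. The kernels are written by hand for the example (a vertical-edge detector and a 2x2 sum), whereas a deep learning method would learn its filters; the function slides the kernel without flipping it, which is what deep learning libraries usually call convolution.

```python
def filter2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hand-written filters, for illustration only.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
BOX_SUM = [
    [1, 1],
    [1, 1],
]

# A 5x6 image: dark on the left, bright on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

edges = filter2d(image, VERTICAL_EDGE)   # first pass: respond to the edge
pooled = filter2d(edges, BOX_SUM)        # second pass: filter the filtered image
```

The first pass is zero in the flat areas and non-zero where dark meets bright; the second pass works on that response rather than on the original pixels.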

There are competitions where, given a large database full of images and keywords, people can submit their algorithms. Pretty interesting results are obtained and, as in sport, faster solutions keep appearing, often coming with new tools for handling large databases.

Breaking the machine

Fortunately there is always something to improve. Images can contain more than one object: you could have a bike and an elephant in the same picture. In that case, what should our algorithm reply first? There is room for subjectivity here.

These algorithms have to deal with the same constant stream of information that we process, meaning that we are always learning (in theory of course, because the world is full of lazy bastards, which keeps the marketing and sales people happy by making us predictable and therefore easy targets, but I digress), and we have to find a way to give this ability to our algorithms or they risk over-learning. I really like the metaphor of making an algorithm able to forget part of what it has deep-learned so that it can adjust its judgement.

The interesting problems are those that go beyond the first limitations encountered. You could try to identify the elements in a picture (e.g. there is an elephant and a bike) or "simply" give the image a score. An artist will tend not just to make the same picture again but to add to his images something that defines the way he perceives the world around him, something pretty unique. A similarity factor or score can be very helpful when you are browsing a large image database or "just" the internet.

20/01/2015

Pano for the people 2

I "closed" this project by adding some final examples, more precisely examples where something can go wrong, showing what that looks like in images.

One important thing for creating a beautiful little planet is to first have a full spherical panorama picture. This panorama has to be presented as a rectangular image, an equirectangular image with a 2:1 ratio; this means the virtual camera that would have taken this image has an equirectangular lens with a 360° field of view. That is the requirement for my scripts and functions that generate the template file used to control hugin. But you are free to distort any rectangular image with a different ratio.

All the code and guidelines are here: panoToLittlePlanet.
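Separately from the hugin-based scripts, the geometry can be sketched in a few lines of Python. This is not the code from panoToLittlePlanet: it assumes a stereographic projection centred on the nadir (a common choice for little planets), and the function names and the `focal` parameter are invented for the example.

```python
import math


def equirect_pixel(lon_deg, lat_deg, width):
    """Map longitude/latitude (degrees) to continuous (column, row)
    coordinates in a 2:1 equirectangular image of the given width."""
    height = width // 2
    col = ((lon_deg + 180.0) / 360.0 * width) % width  # wrap around at 360°
    row = (90.0 - lat_deg) / 180.0 * height            # row 0 is the zenith
    return col, row


def little_planet_to_lonlat(x, y, focal=1.0):
    """Assumed stereographic projection centred on the nadir.

    (x, y) are plane coordinates relative to the centre of the little
    planet; the result is the longitude/latitude (degrees) seen there.
    """
    r = math.hypot(x, y)
    theta = 2.0 * math.atan(r / (2.0 * focal))  # angle measured from the nadir
    lat = math.degrees(theta) - 90.0
    lon = math.degrees(math.atan2(y, x))
    return lon, lat


# Centre of the little planet looks straight down; radius 2 * focal is the horizon.
centre = little_planet_to_lonlat(0.0, 0.0)
horizon = little_planet_to_lonlat(2.0, 0.0)
middle_pixel = equirect_pixel(0.0, 0.0, 4096)
```

Under this assumption, every point within a radius of 2 * focal of the centre comes from the southern hemisphere of the panorama, and the northern hemisphere is pushed outside that circle.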

What does it mean to generate a little planet from the southern hemisphere of a full spherical panorama? The answer is below:

Full spherical image with only the southern hemisphere visible.

the resulting little planet.

25/10/2013

Two days in vfx world - animago 2013

The Animago 2013 event took place on Thursday and Friday in Babelsberg (near Berlin): two days of talks and presentations about movie production, and more specifically CGI, special effects and other VFX. The speakers came from different production houses: companies involved in movie production of course, but also in the production of commercials, TV shows and other TV productions.

What was interesting? Two things. First, you don't need much equipment to create high-quality CGI. Some computers, one or two Kinects, good software and, most important, talented people, and you are ready. Second, it was interesting to see how big movie productions (e.g. from the US) send scenes to be produced overseas (e.g. in Europe). A few seconds of film, or a few shots, are sent to a company, and that company finishes the "job": adds explosions, destroys buildings, removes people when necessary. They may not have the resources to produce a whole movie, but a few seconds are completely doable. It also gives a bit the feeling that in that field everything is about producing the best explosion on screen (half true), or that too much attention is put on very small details once you know which movie it is or how long the scene to be processed is.

There were also longer talks describing the work on almost an entire movie; you can call them making-ofs. One talk was great: a team from a company based in Stuttgart told the story of their work on the movie Oblivion. It was fascinating how much effort was put into that movie (knowing it is reasonably average sci-fi with a nonsensical ending). Not only effort, but quality work. A perfect example of old-fashioned studio work combined with up-to-date VFX. So, if you have seen this movie: the house in the sky was built at real size but anchored to the ground, and the view above the clouds was simulated by re-projecting (I call it immersive cinema, in case you are not following...) various panorama movies in every direction (shot in Hawaii with a camera cluster). Very cool I may say, a nice mix of creative work and proper scientific work (how to record panorama movies, how to project them, how to set up such a projection system...).

I hope to be able to attend this event next year.

23/01/2013

Hey, what if I took up Python?...

... I said to myself last week, all cheerful and full of good intentions: what if I started pythoning for real? The idea being to get a handle on some other tools, to loosen my grip on Matlab a bit, to take a step toward the big family of Sunday developers, and to try out a language that at first glance I find pretty to program in and pleasant to read while still being relatively efficient.

"It's not a done deal" is the polite version of "stay calm and don't throw your computers out of the window, the warranty has expired". On paper it is doable: we (there are two of us) already have something running in Matlab, we know what we want to do, and piece by piece everything we want to do should in principle work. Roughly, we have a video stream to grab and process, images to create, the webcam to calibrate, image regions to select and extract, color and/or intensity differences to compute... nothing insurmountable.

What do we have in terms of tools? Well, Python 2.7, numpy, scipy and matplotlib to do things "à la Matlab", and above all OpenCV to grab a video stream easily. On my work computer, running Windows, everything is fine; even Eclipse and PyDev run. On my Mac, however, it's crap: I understand nothing about installing OpenCV, and above all about getting my Python to admit that yes, OpenCV is there. The worst part is that it shows up in my list of packages, but every import cv blows everything up.

In short, after a few days of hunting for tutorials it is still confusing. There are several ways to handle images, to read them, to display them... which is good, there is never just one solution. Where I am not gifted is that I can't manage to find decent information, and I break my teeth trying to translate from Matlab (probably a mistake on my part). Handling images with N channels for N > 1 is normally not too complicated with ndimage, but that is forgetting my computing handicap and a certain digital bad luck. Which reminds me that I haven't yet tried playing with GIS images and Python.

So I am starting to see more clearly, even if simply creating a constant-valued image (e.g. a completely red image) well and truly ruined my day.
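For the record, one possible way to build such a constant image with numpy is sketched below. The helper name `constant_image` and the image size are made up for the example, and the array layout (height, width, channel) with 8-bit values is an assumption about what the rest of the pipeline expects.

```python
import numpy as np


def constant_image(height, width, rgb):
    """Return a (height, width, 3) uint8 image filled with one RGB colour."""
    img = np.empty((height, width, 3), dtype=np.uint8)
    img[:] = rgb  # the 3-value colour is broadcast to every pixel
    return img


# A completely red image, channels in R, G, B order.
red = constant_image(240, 320, (255, 0, 0))

# OpenCV stores colour images in B, G, R order by default,
# so reverse the channel axis before handing the array to it.
red_bgr = red[:, :, ::-1]
```

Coming from Matlab, the channel order is an easy thing to trip over: the same array shows up blue instead of red if it is displayed with a tool that expects the other order.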
