30/09/2015

On not being an expert as a data scientist, and other tech stuff

Last evening I attended a joint Meetup of the Python User Berlin (PUB) and the Zalando Tech Event, hosted by Zalando, with talks on Natural Language Processing (NLP). Both talks went well and gave two views on the topic: one on the state of the art of the Python tools for NLP, and a second, more applied one.

The discussions I had afterwards, while enjoying a Club-Mate - the drink of champions - were equally interesting. First of all I started talking with an NLP expert, trying to explain why I joined this event and what my link with NLP was. In my very recent job experience at EyeEm I just scratched the surface of NLP, preparing data for Machine Learning (ML) using nltk together with WordNet and ImageNet. Actually I didn't do much text analysis, only batch processing of word definitions. In that experiment the text analysis would have come after this step, and that's where semantics enters the discussion. Working with a word dictionary is only one side of the problem: you have one word with its definition. Often - at least with scientists or engineers - you are in the inverse situation: you have words, and what you actually want to extract is a definition, an idea, a piece of information... And I let you google automatic image tagging and deep learning.

After exchanging ideas and experiences about NLP I kept sipping the offered mate with a Zalando employee. I was curious - as usual - to understand what it means to be a data scientist here. Because if the definition is very general - a data scientist works with data, we are not experts - it's interesting to see how many fields we - I'm one of those people - cover in our daily work. Using the same language - e.g. Python - we can go from signal processing, computer vision, image retrieval and NLP to how to deal with Databases - a year ago I wrote on the topic Databases and natural Langage graphs en stock - and how to present your results to non-experts with nice visualizations, and many more... So if we are not experts, we need to be pretty fast at acquiring skills from various fields and/or at using the appropriate tools.

16/09/2015

Etsy opening

Following my first exhibition of little planet pictures back in June in Morlaix - it's in Brittany - I opened a shop on Etsy a few days ago. The shop is called JeremieLittlePlanets for two reasons: first, my name is Jérémie, and second, it is about my little planet images.

Have a look at the shop, like it if you like it, share it and even order images if you like them very much. They make great gifts, I heard.

Each image has been printed 5 times, 3 in 70x70cm and 7 in 40x40cm. Each copy is signed and numbered. Some of them have found lucky buyers, so far mostly in France: in Paris, Nantes and Morlaix. But my images can't wait to travel more.

The selection of images I presented last June together with my friend Laurent from feelsen holds memories of the time I spent over the last years in Berlin, Paris, Marseille, Szczecin, the Polish coast, the forest of Reunion island, the South of France...

14/09/2015

First marathon, sympa mais long

[in English below]

Pourquoi?
Pourquoi pas j'ai envie de dire. Même si je cours beaucoup plus régulièrement, je n'avais pas au départ l'idée de faire une course aussi longue. Mais voilà, on court un peu, deux-trois fois par semaine, le corps s'habitue, la tête suit et y prend du plaisir. Tout ça marche par étape. La barrière des 10km sans fatigue est franchie depuis plusieurs mois, dans le même intervalle de temps consacré au jogging les distances se rallongent. Il y a un nouveau palier à franchir.

Et voilà qu'au détour d'une réunion mémorable de spotters à Amsterdam en octobre dernier beaucoup se rendent compte que la course à pied est pratiquée par pas mal d'entre nous. Mais bon, on vit tous dans une ville différente et on va pas aller s'entrainer ensemble tous les weekends non plus...

Courir un marathon peu l'avait déjà fait, mais l'envie d'en courir devait être là et moi compris - d'ailleurs fun-fact-number-one, dites à quelqu'un que vous courez et il vous demandera "pour un marathon???" répondre non clôtura la discussion aussi vite.

C'est donc d'un oui massif que l'idée de se retrouver pour courir un marathon a été reçue. On ne remerciera jamais assez Martin d'avoir lancé l'idée. Vote il y a eu et Vilnius fut choisie. J'en reviens juste, de la lumière dans les yeux et des courbatures partout ailleurs.

Fun-fact-number-two, dites à quelqu'un que vous allez faire un marathon dans les semaines qui arrivent et il vous répondra "un semi-marathon" ou "un vrai marathon?" comme si... Bref.

Un peu avant la course
La course est à Vilnius, capitale de la Lituanie. Pour la rejoindre j'atterris d'abord à Riga où je retrouve un spotter en chef que nous appellerons Bart et une spotter en voyage que nous appellerons Gol. De Riga nous sautons dans un bus, quelque trois cents km plus loin nous arrivons à destination. On est vendredi soir et la course est dimanche. Superbe accueil des locaux, on peut difficilement faire mieux.

Pendant la course
Deux tours de piste au programme, mais une piste de 21km.

Départ groupé pour le marathon et le semi. La première boucle se passe bien pour moi, je suis dans mes temps sans aller trop vite ni trop lentement, un bon 10km/h.

La ligne d'arrivée franchie pour la première fois la route devient déserte devant nous. En gros la majorité des coureurs au départ de 9h00 étaient là pour le semi.  Une sorte de second départ, des coureurs tous les 100m et peu de gens pour applaudir dès qu'on quitte le centre ville. Un mal pour un bien car le revêtement est beaucoup plus confortable.

Peu après la moitié faite une petite douleur se présente autour de mon genou droit. La guigne ou les restes d'une mini entorse en beach-volley début aout... Je ralentis, fait un peu plus attention quand je pose mon pieds sachant qu'après deux heures déjà les jambes sont lourdes. Heureusement en chemin je retrouve un Lucas avec qui j'entre dans le parc. Nous sommes dans la forêt, de longues lignes droites. Bientôt nous dépassons la marque des 28km, distance la plus longue jamais couru en une fois par moi même. Cool! Mais c'est pas fini, il en reste 14...

De là je continue seul, obligé de faire des pauses, mini étirements, petits intervalles marchés. Si je cours seul je ne le suis pas vraiment. On est tout un groupe à courir par intermittence, se dépasser, s'arrêter, se saluer, se sourire. Mais bon, la barre des 36km vient d'être passée et le reste du parcours est dans la vieille ville, c'est jolie certes mais aussi plein de pavés qui vous pètent les pieds et les jambes, de plus ça monte...

C'est un peu dure quand les jambes n'avancent plus, même marcher est dur, même la dernière descente est dure et je n'arrive pas à m'étirer les jambes, tout est bloqué.  Mais les barres des 40km, 41km font plaisir à voir et je réussis à finir les 500 derniers mètres en courant. Les spotters cheerleaders sont bien là! Ca fait plaisir et cette bière sans alcool est la meilleure du monde!

Je rentre à l'hostel en boitant avec Martin, un Martin avec plein d'étoiles dans les yeux et moi déjà en train de penser à changer mon entrainement. En gros je dois faire plus de longue distance et apprendre à courir, manger, digérer en même temps et améliorer ma technique, réparer mon genou, bref il y a du boulot.

Mais quelle journée! On, c'est à dire deux bulgares et une lettone la plus discrète et la plus rapide des spotters marathoniens du jour, la plus jeune mais aussi la moins novice. Toujours bien de pratiquer un sport avec plus fort que soi, ce n'est que plus de motivation pour la prochaine fois.

On est lundi soir, je commence juste à sentir mes muscles se détendre. Joie.

--

[and now in English]

Why?
Why not, I would answer. Even though I jog regularly, I didn't have the idea - at the beginning of my numerous attempts to have a sporty life - of doing such a long run one day. But here is the thing: you go jogging a bit, two or three times a week, the body gets used to it, the head follows and takes some pleasure in it. It goes step by step. The 10km barrier was crossed without fatigue a few months ago, and for the same training time the distances get longer. There is a new level to reach.

While we were having an amazing spotter weekend in Amsterdam last October, many of us realized that jogging was an activity a lot of us shared. But we live in different cities and we can't go jogging together every weekend... Running a marathon: some of us had done it already, and obviously the wish to do it was there in some of us.

Fun-fact-number-one: if you tell someone you run regularly, he/she will almost instantly add "for a marathon?", and answering no will end the discussion just as fast.

Following the spotter weekend, someone - we will call him Martin - proposed to join forces and register for a marathon. Destinations were proposed and Vilnius won. Lithuania it will be! I'm just coming back from it, lights in my eyes and stiffness everywhere else.
 
Fun-fact-number-two: tell someone you are going to run a marathon and he/she will reply "a half-marathon or a real marathon?", just checking if...

A bit before the run
The run is in Vilnius, capital of Lithuania. To reach my destination I first land in Riga, where I meet one of the spotter bosses - we will call him Bart - and a traveling foodie expert/instafood addict - we will call her Gol. From the capital of Latvia we jump into a bus heading for Vilnius. It's Friday evening, the run is on Sunday. A wonderful welcome from the locals, it would be difficult to do better!

During the run
Only two loops are scheduled, but each loop is 21km long. Grouped start for the half-marathon and the marathon. The first round goes according to plan for me, neither too fast nor too slow, a good 10km/h pace.

The finish line is crossed for the first time, leaving the road empty in front of us. We were a few thousand two hours ago, now only the marathon runners stay in the race. We leave the city center again: fewer people to cheer us on, but more space and a flatter road.

Then I start to feel a little pain in my right knee. Bad luck, or the remains of an ankle twisted while playing beach volleyball a month ago. I need to slow down and be more careful about how I land my foot, but after two hours your legs are getting heavy and it's difficult to control everything. Luckily this is when I meet a Lucas, with whom I run for a few km. We enter the big park, jog into the forest and soon pass the 28km sign, the longest distance I have ever run. Cool! But there are still 14km to go...

From there I continue alone, leaving my partner to his own pace. I need to stop sometimes to do some stretching exercises, sometimes walk a bit and start again. I'm not completely alone: the remaining runners around me are pretty much like me, running/walking/making faces. We recognize each other and smile at each other. The 36km sign is passed and the rest of the route crosses the old town. It looks nice but the road sucks, full of ups and downs and ugly cobblestones...

Now it really starts to be difficult, the legs are super heavy and sometimes don't even want to walk. On the last downhill slope I can't relax my body, but the 40km mark has been reached! Then 41km, and gathering my last energy I manage to finish the last 500m running for real, in pain. The spotter cheerleaders are there! And the non-alcoholic beer you get at the finish is the best drink on earth!

I limp back with Martin to our hostel. On our way we have time to cheer one Bart followed by one Sarunas in the 10km race, go go go! A Martin full of lights in his eyes, and me thinking of my new training. Basically I need to do more long distance training, improve my technique, learn to jog, eat and digest at the same time, fix my knee... There is room for improvement.

What a day! My flight to Berlin takes off from Riga tomorrow, so I need to reach that city in time. We - two Bulgarians and me - are joining Anete, who is driving back to Riga. She is one of the Latvian spotters in Riga and, among the marathon-running spotters, the youngest, the most discreet, the fastest and the most experienced of us. It's always good to practice a sport with people stronger than you and to share experience, it's only more motivation for the next run.

It's Monday evening, I slowly feel my muscles relax. Joy.

08/09/2015

Meetup for the human machines

I finally managed to attend the Shadow ML - for Machine Learning - meetup in Berlin yesterday evening, hosted by Amazon at their Computer Vision division in Berlin. Two talks were scheduled, one with images and a second with words. Let me explain.

Before pizza time
Here we learned about soft shadow removal. I liked this talk because it combined computer vision (CV) and machine learning (ML), and it's a problem I know well as I regularly face it when post-processing my spherical panorama pictures taken under the sun - you can see my shadow in the picture.

What I remember from the description of the hard shadow problem is that a big part of the solution is being able to isolate the shadow areas in the picture. What is a shadow area you may wonder? It's a part - or parts - of an image where the brightness has been drastically reduced such that it appears almost grey, but where some color information is still available. Having said that, we have almost solved our problem: we need to find the color information in the shadow area and adjust its brightness to match the neighboring non-shadowed area. You may have to operate in a color space other than RGB to keep the chromatic information undamaged and change only the pixel brightness/luminance. A good image segmentation is an unavoidable step.
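To make the "change only the brightness" idea concrete, here is a minimal sketch of how I would try it, not what the speaker showed. It assumes the shadow mask is already known, uses HSV from Python's standard colorsys module as the "other color space" (the V channel is only a rough stand-in for luminance), and applies one global gain to the shadow area. The function names mean_value and relight_shadow are mine.

```python
import colorsys


def mean_value(pixels):
    """Average HSV value (brightness) of a list of RGB pixels in [0, 1]."""
    return sum(colorsys.rgb_to_hsv(*p)[2] for p in pixels) / len(pixels)


def relight_shadow(pixels, mask):
    """Brighten masked pixels so their mean brightness matches the lit ones.

    pixels: list of (r, g, b) floats in [0, 1]
    mask:   list of bools, True where the pixel lies in the shadow
    Hue and saturation are left untouched, only V is scaled.
    """
    shadow = [p for p, m in zip(pixels, mask) if m]
    lit = [p for p, m in zip(pixels, mask) if not m]
    if not shadow or not lit:
        return list(pixels)  # nothing to compare against
    shadow_mean = mean_value(shadow)
    if shadow_mean == 0:
        return list(pixels)  # pure black: no color information left to scale
    gain = mean_value(lit) / shadow_mean
    result = []
    for p, m in zip(pixels, mask):
        if m:
            h, s, v = colorsys.rgb_to_hsv(*p)
            p = colorsys.hsv_to_rgb(h, s, min(1.0, v * gain))
        result.append(p)
    return result
```

Calling relight_shadow([(0.8, 0.4, 0.2), (0.4, 0.2, 0.1)], [False, True]) brings the second, darker pixel back to roughly the color of the first one. A single gain only makes sense when the shadowed surface is homogeneous.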

For hard shadows the segmentation is an "easy" task as the transition between shadow and non-shadow areas is fast/brutal, in other words not soft. The problem with a soft transition is that it requires a lot of human input to mask the image - in the sense of creating a mask that isolates the shadow areas from the others - and we want to automate this task.
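As an illustration of why the hard case is "easy", a plain brightness threshold already gives a usable mask when the transition is sharp. This is my own toy sketch, not something from the talk: the luma weights are the usual Rec. 601 ones, while the function names and the rule "darker than 60% of the mean brightness" are arbitrary assumptions.

```python
def luma(pixel):
    """Rec. 601 luma of an (r, g, b) pixel with components in [0, 1]."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def hard_shadow_mask(image, ratio=0.6):
    """Return a mask (rows of bools), True where a pixel is much darker than average.

    image: list of rows, each row a list of (r, g, b) tuples
    ratio: fraction of the mean luma below which a pixel counts as shadow
           (an arbitrary choice for this sketch)
    """
    values = [luma(p) for row in image for p in row]
    cutoff = ratio * sum(values) / len(values)
    return [[luma(p) < cutoff for p in row] for row in image]
```

With a soft shadow the penumbra spreads over a whole range of brightness values, so no single cutoff separates the two areas, and that is where the manual masking work starts.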

A solution proposed yesterday was to use machine learning to make the system learn the difference between images with and without shadow. The speaker talked about the problem of getting data - a recurrent part of modeling a machine learning problem, and of any other scientific problem - and how he created his data-set: computer generated images made with Maya, from which he could get two sets for the same scene, one with shadow and another without.

After that I got a bit lost on what the author does once he has found where the shadow areas are. But assuming the areas have been well discriminated, you still need to adjust the brightness level. From there, at least two solutions: if the area is homogeneous then a simple scaling factor/function should do something; treating the background - or the area - as a texture can be helpful too, especially if your plan is to use in-painting techniques. But the chosen solution is of course linked to what you want to do: preserving the information in the image - then I would say no in-painting - or tricking the eye/human brain such that the image appears nice without shadows - then go for in-painting.

After pizza time
A completely different topic followed, but no less interesting. It was about text and word analysis. For an introduction you can check WordNet to get a glimpse of what that field is. But back to the second speaker: his problem was to see if we can predict an affiliation to a political party based on text analysis.

As the speaker mentioned, this is/was a work in progress where the first task was to establish a usable data-set for building the classifier. The text of each party manifesto was used for that purpose.

Once you have your classifier, what you want is to evaluate it. All the interventions and talks given by government members and parliament members are perfect data sources for that, and articles from different newspapers could be fed to the system as well.
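The speaker did not go into which classifier he used, so the following is only my guess at the simplest thing that fits the description: a bag-of-words naive Bayes trained on one manifesto per party and evaluated on labelled speeches. Everything in it is an assumption for illustration: the function names, the whitespace tokenizer, the uniform prior over parties, and the made-up parties and texts in the usage note.

```python
import math
from collections import Counter


def tokenize(text):
    """Very crude tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def train(manifestos):
    """manifestos: dict mapping a party name to its manifesto text."""
    counts = {party: Counter(tokenize(text)) for party, text in manifestos.items()}
    vocab = set().union(*counts.values())
    return counts, vocab


def predict(model, text):
    """Return the party whose manifesto makes the text most likely."""
    counts, vocab = model
    best_party, best_score = None, -math.inf
    for party, words in counts.items():
        total = sum(words.values())
        # Add-one (Laplace) smoothing; words outside the vocabulary are ignored.
        score = sum(
            math.log((words[w] + 1) / (total + len(vocab)))
            for w in tokenize(text)
            if w in vocab
        )
        if score > best_score:
            best_party, best_score = party, score
    return best_party


def accuracy(model, labelled_speeches):
    """labelled_speeches: list of (party, text) pairs held out for evaluation."""
    hits = sum(predict(model, text) == party for party, text in labelled_speeches)
    return hits / len(labelled_speeches)
```

For example, train({"party_a": "protect forests rivers climate energy climate", "party_b": "lower taxes free trade business taxes"}) followed by predict(model, "business needs lower taxes") should pick party_b. Real manifestos and speeches would of course need proper tokenization and a lot of cleaning first.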

This work also goes in the direction of sentiment analysis, and a temporal parameter is something you want to have in such a problem. Depending on who is running the country and who has the majority in parliament, the roles played and the words used by the people's representatives evolve. It might be obvious, but this kind of tool can tell us how we perceive the words and talks given by our politicians, and how much they or we interpret/dream/hallucinate about different situations.

Building such a system wasn't too complicated - if I got it right from the speaker(s) - and the main challenge was/is to get clean data. As for all machine learning you need clean data, as in every basic or applied research actually.