I recently started a big data project with Mathieu Dumoulin. We are using Mahout with Hadoop to run machine learning as MapReduce jobs, so that it scales properly to big data. We found a way to test our MapReduce code, and that is what I present in this post.
In this post, I present a paper that I wrote with Mathieu Dumoulin. We present a fully scalable approach to improving classification: examples that can be labeled with high confidence are taken from a big dataset of unlabeled examples and added to a small original training set.
While doing code reviews in a class of students at Laval University, I heard that some people think design quality is a very subjective matter. In their view, as long as they used design patterns, the code they developed was clean. In this post, I will explain why everyone should be careful when using design patterns. While there are a lot of reasons to use them (and we should!), some patterns have drawbacks and flaws that should be considered.
In this article, I will introduce Pig, a scripting language that is used with Hadoop. I had to learn Pig for a project I worked on with Mathieu Dumoulin. Pig’s syntax is very similar to SQL’s, but it lets you manipulate big data in MapReduce mode quite easily.
This post presents the basics of implementing a reinforcement learning algorithm called Dynamic Bayes Decision Network. This algorithm can be used to build a robust and efficient expert system. In this post, I present its characteristics and show the code of its implementation.