Toolkit for (More) Reproducible Machine Learning Projects

A deep-in the tools to use for building a more reproducible ML project

David Beauchemin

8 minutes read

Over the past years, I’ve worked on various machine learning projects (mostly research ones), and I’ve faced numerous problems along the way that impacted the reproducibility of my results. I had to regularly (not without hating myself) take a lot of time to resolve which experiment was the best and which settings were associated with those results. Even worse, finding where the heck were my results was painful. All these situations made my work difficult to reproduce and also…

Read more

How to submit a blog post

A template to contribute to the blog .Layer

Samuel Perreault and David Beauchemin

4 minutes read

Contributing to the blog has never been easier. First of all, it must be said that any submission, whatever its format (Markdown, Microsoft Word, Notepad, name it!), will be considered, and ultimately transcribed into Markdown by our team. We offer the option to submit an article here, and we are already thinking of a way to review non-Markdown documents (possibly Google Docs). This being written, for those who would like to write and submit a post in the conventional way, here is a simple…

Read more

What's wrong with Scikit-Learn.

Scikit-Learn had its first release in 2007, which was a pre deep learning era. However, it’s one of the most known and adopted machine learning library, and is still growing.

Guillaume Chevalier

7 minutes read

Scikit-Learn’s “pipe and filter” design pattern is simply beautiful. But how to use it for Deep Learning, AutoML, and complex production-level pipelines?

Read more
Jean-Thomas Baillargeon

7 minutes read

This is probably going to sound cliché and trivial but I just realized that I learned how to use a computer before learning to use a pen. I launched my favourite game on a MS/DOS terminal a few years before writing my name on paper.

Read more

Stein's paradox and batting averages

A simple explanation of Stein's paradox through the famous baseball example of Efron and Morris

Samuel Perreault

9 minutes read

There is nothing from my first stats course that I remember more clearly than Prof. Asgharian repeating “I have seen what I should have seen” to describe the idea behind maximum likelihood theory. Given a family of models, maximum likelihood estimation consists of finding which values of the parameters maximize the probability of observing the dataset we have observed. This idea, popularized in part by Sir Ronald A. Fisher, profoundly changed the field of statistics at a time when…

Read more
dot-hockey

8 minutes read

For any die-hard hockey fan, September 13th will be remembered as the day Erik Karlsson was traded by the Sens, and not without controversy, as it seems Ottawa got less for Karlsson than they gave to acquire Duchene earlier this year. It turns out that the 4th Annual Ottawa International Hockey Analytics Conference (OTTHAC18), again held at Carleton University, was scheduled on the following weekend (September 14th and 15th, 2018). We couldn’t get there for the workshop and activities of…

Read more

The grammar of graphics

The art of communicating with graphics

Stéphane Caron

10 minutes read

Data science is a growing field that regroups talented and passionate people who generally show remarkable technical skills and outstanding ease to solve problems. However, from my personnal experience, one particular skill is often undervalued: Communication. In data science, as in many other fields, graphical vizualisation is an important tool that helps to simplify and clearly communicate results to others. For that reason, it’s crucial to master this set of skills. Just like any other…

Read more
  • Category