At our next MünsteR R-user group meetup on Tuesday, August 28th, 2018, Jenny Saatkamp will give a talk titled "Blog Mining: Deriving the success of blog posts from metadata and text data". She will present her Blog Mining analysis, which is based on 1,500 blog posts from the codecentric company blog (https://blog.codecentric.de/) and makes use of different mining techniques for metadata and text data. You can RSVP here: http://meetu.ps/e/F7zDN/w54bW/f

This is the code that will accompany an article that will appear in a special edition of a German IT magazine. The article is about explaining black-box machine learning models. In it, I showcase three practical examples: explaining supervised classification models built on tabular data using caret and the iml package; explaining image classification models with keras and lime; and explaining text classification models with xgboost and lime.

After posting my short blog post about Text-to-speech with R, I got two very useful tips. One was to use the googleLanguageR package, which uses the Google Cloud Text-to-Speech API. And indeed, it was very easy to use and the resulting audio sounded much better than what I tried before! Here’s a short example of how to use the package for TTS. Set up Google Cloud and authentication: you first need to set up a Google Cloud account and provide credit card information (the first year is free to use, though).

These are the slides from my workshop "Introduction to Machine Learning with R", which I gave at the University of Heidelberg, Germany, on June 28th, 2018. The entire code accompanying the workshop can be found below the video. The workshop covered the basics of machine learning. With an example dataset, I went through a standard machine learning workflow in R with the packages caret and h2o: reading in data; exploratory data analysis; missingness; feature engineering; training and test split; model training with Random Forests, Gradient Boosting, Neural Nets, etc.
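To give a rough idea of what such a workflow can look like, here is a minimal sketch with caret. It is not the workshop code: the built-in iris dataset, the 80/20 split, the 5-fold cross-validation and all object names are assumptions chosen for illustration, and method = "rf" additionally needs the randomForest package to be installed.

```r
# Minimal sketch of a caret workflow; iris is a stand-in dataset,
# not the data used in the workshop.
library(caret)

set.seed(42)
data(iris)

# Quick exploratory checks: dimensions and missing values per column
dim(iris)
colSums(is.na(iris))

# Stratified training and test split (80/20 is an assumed ratio)
train_index <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_data <- iris[train_index, ]
test_data  <- iris[-train_index, ]

# Train a Random Forest with 5-fold cross-validation
model_rf <- train(
  Species ~ .,
  data = train_data,
  method = "rf",
  trControl = trainControl(method = "cv", number = 5)
)

# Evaluate on the held-out test set
predictions <- predict(model_rf, newdata = test_data)
confusionMatrix(predictions, test_data$Species)
```

Swapping method = "rf" for another caret method (for example "gbm" for Gradient Boosting) keeps the rest of the sketch unchanged, which is the main appeal of caret's unified train() interface.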

Text-to-speech with R

Computers have started talking to us! They do this with so-called Text-to-Speech (TTS) systems. With neural nets, deep learning and lots of training data, these systems have gotten a whole lot better in recent years. In some cases, they are so good that you can’t distinguish between a human and a machine voice. In one of our recent codecentric.AI videos, we compared different Text-to-Speech systems (the video is in German, but the text snippets and voice recordings we show in it are a mix of German and English).

Last week I published a blog post about how easy it is to train image classification models with Keras. What I did not show in that post was how to use the model for making predictions. I will do that here. But predictions alone are boring, so I’m adding explanations for the predictions using the lime package. I have already written a few blog posts about LIME and have given talks about it, too.

Dr. Shirin Glander

Biologist turned Bioinformatician turned Data Scientist

Data Scientist

Münster, Germany