It is June and nearly half of the year is over, marking the midpoint between Christmas 2018 and 2019. Last autumn, I published a blog post about predicting the search volume for Wham’s “Last Christmas” using Google Trends data and different types of neural network architectures. Of course, now I want to know how good the predictions were, compared to …
Simulating the bias-variance tradeoff in R
In my last blog post, I elaborated on the bagging algorithm and demonstrated its prediction performance via simulation. Here, I want to go into the details of how to simulate the bias and variance of a nonparametric regression fitting method using R. These kinds of questions arise here at STATWORX when developing, for example, new machine learning algorithms or …
Optimising your R code – a guided example
Do you want to optimise your code but don’t know where to start? In this post, I guide you through my thought process while optimising my own code.
Coding Random Forests in 100 lines of code*
In our series explaining methods in 100 lines of code, we tackle random forests this time! We build one from scratch and explore its functions.
Using Reinforcement Learning to play Super Mario Bros on NES using TensorFlow
Could you #BeatTheAI? We let deep learning have a go at Super Mario’s first level and compared it to human players. Here we explain how we did it!
6 myths about refuelling – tackled with statistics
Is it cheaper to fill up your gas tank in the evening? Many car drivers have their own theories and myths about refuelling. Our colleague Jakob tackled 6 of these myths using statistics in his latest blog post.
Automated creation of Docker containers
In this blog post, we focus on automated bash/shell scripts that create Docker containers. We showcase their usage with an R Shiny example.
How to Speed Up Gradient Boosting by a Factor of Two
Our latest tool development at STATWORX: random boost, an algorithm twice as fast as gradient boosting, with comparable prediction performance.
R and Python: Using reticulate to get the best of both worlds
We at STATWORX mostly use R or Python for our projects. But why not both? With the help of the reticulate package, we can use Python within R. Here we show an example of how to train a Support Vector Machine.
Fixing the most common problem with Plotly histograms
In today’s blog post, we show you how to improve the interactivity of Plotly histograms with automatic rebinning.