Monotonicity constraints can help models represent the underlying relationships more faithfully. This post explains how to implement such monotonicity constraints in R.
Simulating the bias-variance tradeoff in R
In my last blog post, I elaborated on the Bagging algorithm and showed its prediction performance via simulation. Here, I want to go into detail on how to simulate the bias and variance of a nonparametric regression fitting method using R. These kinds of questions arise here at STATWORX when developing, for example, new machine learning algorithms or …
How to Speed Up Gradient Boosting by a Factor of Two
Our latest tool development at STATWORX: Random Boost, an algorithm twice as fast as gradient boosting, with comparable prediction performance.
Ensemble Methods in Machine Learning: Bagging & Subagging
In this blog post we will explore the Bagging algorithm and a computationally more efficient variant thereof, Subagging. With minor modifications, these algorithms are also known as Random Forest and are widely applied here at STATWORX, in industry, and in academia.
Food for Regression: Mixing in Cross-Elasticities and Promotional Effects
Last time we dove deep into the world of a little salad bar just a few steps away from the STATWORX office. This time we are going to dig even deeper … well, we are going to dig a little deeper. Today's specials are cross-elasticities and the effect of promotions. We talked so much about salads because the situation of …
A Performance Benchmark of Different AutoML Frameworks
In a recent blog post our CEO Sebastian Heinz wrote about Google's newest stroke of genius – AutoML Vision. A cloud service "that is able to build deep learning models for image recognition completely fully automated and from scratch". AutoML Vision is part of the current trend towards the automation of machine learning tasks. This trend started with automation of …
Benchmarking Feature Selection Algorithms with Xy()
Feature selection is one of the most interesting fields in machine learning in my opinion. It is a boundary point of two different perspectives on machine learning – performance and inference. From a performance point of view, feature selection is typically used to increase the model performance or to reduce the complexity of the problem in order to …
Food for Regression: Using Sales Data to Identify Price Elasticity
A few hundred meters from our office, there is a little lunch place. It is part of a small chain that specializes in assemble-yourself, ready-to-eat salads. When we moved into our new office a few years ago, this salad vendor quickly became a daily fixture. However, over time, this changed. We still eat there regularly, but I am certain, if one …
Pushing Ordinary Least Squares to the limit with Xy()
Simulation is mostly about answering particular research questions. Whenever the word simulation appears somewhere in a discussion, everyone knows that this means additional effort. At STATWORX we use simulations as a first step to prove concepts we are developing. Sometimes such a simulation is simple; in other cases it is plenty of work. Though, research …
Comparing predictions: World Cup scores
Like many others, some colleagues at STATWORX and I took part in a little betting game for the World Cup 2018. Since the group stage is over, I was wondering how well – or rather, how badly – my prediction fared. I am comparing my result with other predictions using the point system of the betting game. All …