💾 Installing MySQL on macOS (and using it with R)

A couple of days ago I was asked to install MySQL on macOS 10.13, and I was surprised that it was not a one-click installation, as it is for R. Unfortunately, even I found the documentation a bit confusing, so I think a guide to the installation process might be useful.

📈 Simulating Poisson process (part 2)

In the previous post we discussed two common methods of simulating a Poisson process. This seemingly trivial problem interests me because it is a simplification of a larger problem: simulating a classical ruin process. As a reminder, I focus on an extension of the Cramér–Lundberg model with positive jumps, that is:

📈 Simulating Poisson process (part 1)

A couple of weeks ago a colleague of mine asked me to help estimate the Gerber-Shiu function by Monte Carlo methods. The function is used in ruin theory for risk processes. One can think of it as an equivalent of a moment generating function: once the function is known, it is easy to derive certain measures of interest, for instance the ruin probability. My colleague wants to estimate this function for an extension of the Cramér–Lundberg model that includes positive jumps (capital injections). At first glance it seems a trivial task, but once I started working on it, the problem turned out to be not so easy to solve.

📊 Multinomial regression in R

In my current project on long-term care, at some point we needed a regression model with multinomial responses. I was very surprised that, in contrast to the well-covered binomial GLM for binary responses, the multinomial case is poorly described. Sure, there are half a dozen overlapping packages, but there is no sound tutorial or vignette. Hopefully, my post will improve the current state.

🔬 Dortmund real estate market analysis: neural networks

In almost every non-technical post about AI for a broader audience, the author feels duty-bound to mention deep learning as a panacea for all woes. Well, it’s not. Deep learning is just one of various models, which might or might not perform better than other techniques. In a nutshell, it’s just regular neural networks with multiple hidden layers between the input and output layers (well, that’s rather an oversimplification, but you get the idea). In this post I am curious whether a neural network approach can beat our best model so far (a GAM with an inverse Gaussian response distribution).

🌳 Dortmund real estate market analysis: tree-based methods

In previous posts traditional regression models were fitted to real estate data. In this post tree-based models, namely random forests and gradient boosting, are trained to predict rental prices. These methods typically outperform traditional regression models, yielding smaller errors. Furthermore, ensembles of trees tend to be fairly robust to overfitting, which makes them attractive for prediction. However, their main disadvantage (and the reason the insurance industry has little love for them) is that they are difficult to interpret.

📐 Deming versus simple linear regression

All the courses that covered regression models started in almost the same way: given a bunch of $y$’s and $x$’s, one needs to predict the value of $y$ for a certain $x$. Sounds quite easy. Without making any statistical assumptions, we can just find the line that is, in some sense, closest to those points (the best-fit line). So the model is as follows:

💸 Insurance-linked securities' data

As part of my PhD program I have to attend the summer school organized by our department. During this summer school Prof. Braun (one of the speakers) mentioned a super nice resource on catastrophe bonds (cat bonds) and insurance-linked securities (ILS). It provides information, such as the size and the trigger, about most ILS.

📣 Notes from R in insurance 2017

This year the fifth R in Insurance conference was held at ENSAE, Paris. My first impression was: “Wow, that’s a lot of people. Many more than last year”, and I hope my estimate is not biased. Thanks to the organizers, it was indeed a true pleasure to be on both sides, as a presenter as well as a listener. I really love that unique atmosphere and mixed audience: not many conferences offer feedback from both the academic and the industry perspective at the same time.

🎓 Insights from students data

Recently (well, a month ago) I had a discussion with a friend of mine about modern tools and approaches in education. He is currently involved in an edX platform startup, and given that I am an assistant at the university, we had several points to discuss.

💻 Bringing together R and Shell

I believe that in our era of RStudio and interactive data analysis, R scripts rarely need to be run from the shell. The reverse is also true: executing shell commands from R is quite uncommon. However, there are cases where this is necessary.

📦 Installing and loading R packages

One reason for R’s popularity is its ocean of packages. Even though managing packages is pretty straightforward, there are a couple of tricks, do’s and don’ts, and other things that require care.
