If you are a full-time data science practitioner who has moved past the Titanic dataset and worked through the various exercises on Kaggle, you know by now that we wish real-world data problems were that simple, but they are not! This post is about just one … Continue reading RStudio in Docker – now share your R code effortlessly!

# 2 minute refresher to Logistic Regression

Here's a 2-minute refresher on logistic regression for you: Logistic regression is used to model the outcomes of a categorical target variable. Input features are weighted by coefficients, just as in linear regression, but the result is then fed as an input to the logistic function. In linear regression, coefficients are found by minimizing the sum of squared …
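To make the "linear combination fed into the logistic function" idea concrete, here is a minimal sketch in plain Python. The function names (`sigmoid`, `predict_proba`), the weights, the bias, and the 0.5 decision threshold are all illustrative assumptions, not values from the post; in practice the coefficients would be learned from data.

```python
import math


def sigmoid(z):
    """Logistic function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one observation."""
    # Linear part: the same weighted sum used in linear regression.
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Logistic part: turn the unbounded score into a probability.
    return sigmoid(z)


# Hypothetical coefficients, chosen only for illustration.
weights = [0.8, -0.4]
bias = -0.2

p = predict_proba([1.5, 2.0], weights, bias)
# Assumed decision rule: predict class 1 when the probability is at least 0.5.
label = 1 if p >= 0.5 else 0
print(p, label)
```

The only difference from a linear regression prediction is the final call to `sigmoid`, which is what lets the output be read as a class probability.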

# Setting up PySpark for Jupyter Notebook – with Docker

When you Google “How to run PySpark on Jupyter”, you get so many tutorials showing so many different ways to configure an IPython notebook to support PySpark that it gets a little confusing. So when I finally figured out a way to do it, with the help of multiple websites, I thought I would post …