Blog

Explaining the p-value to a non-technical audience

Wikipedia defines the p-value as “the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct”. If we give this definition in a presentation to, say, a product or business team, we will most probably receive piercing, puzzled looks. One of the major…

Understanding Multicollinearity and Confounding Variables in Regression

Multicollinearity: When two or more of the predictors are correlated, the phenomenon is called multicollinearity. It affects the resulting coefficients by masking the underlying individual weights of the correlated variables, which is why model weights are not the same as feature importance. Ways to deal with multicollinearity. Confounding Variables: This is an extreme case of multicollinearity,…
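To make the "masking" effect concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption chosen for illustration (the variable names `x1` and `x2`, the true weight of 3, the noise levels); it is not taken from the post itself. Two nearly identical predictors are fitted with ordinary least squares, and the individual coefficients come out poorly determined even though their sum stays close to the true combined effect.

```python
import numpy as np

# Synthetic data (illustrative assumption): x2 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)

# Only x1 drives the target, with a true weight of 3.
y = 3.0 * x1 + 0.5 * rng.normal(size=n)

# Ordinary least squares on both correlated predictors (no intercept).
X = np.column_stack([x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

corr = np.corrcoef(x1, x2)[0, 1]
print("correlation between x1 and x2:", round(corr, 4))
print("fitted weights:", coef)
print("sum of fitted weights:", round(coef.sum(), 3))
```

Rerunning with a different seed should change the individual weights noticeably while the sum stays near 3, which is one way to see why the weights alone are not a reliable measure of feature importance.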

Unnest (explode) a column of lists in Pandas

In Python, when you have a list of lists and convert it directly to a pandas dataframe, you get columns of lists. This may seem overwhelming, but fear not! Pandas comes to our rescue once again: use pandas.DataFrame.explode(). The dataframe looks like this. Let’s explode the first column alone. The result looks like this…
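The excerpt does not show its data, so here is a minimal sketch with made-up values. The column names `numbers` and `letters` and the rows themselves are assumptions for illustration only. It builds a dataframe whose cells hold lists, then explodes the first column so that each list element gets its own row.

```python
import pandas as pd

# Hypothetical nested data: each cell is itself a list.
rows = [
    [[1, 2], ["a", "b"]],
    [[3], ["c"]],
]
df = pd.DataFrame(rows, columns=["numbers", "letters"])
print(df)

# Explode only the first column; the other column is repeated as-is
# and the original row index is duplicated for each list element.
exploded = df.explode("numbers")
print(exploded)

# Optionally renumber the rows after exploding.
exploded_reset = exploded.reset_index(drop=True)
print(exploded_reset)
```

Note that the exploded column keeps the `object` dtype, so a follow-up `astype` call may be needed before doing arithmetic on it.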

RStudio in Docker – now share your R code effortlessly!

If you are a full-time data science practitioner and have passed through the stages of starting out with the Titanic dataset and working through the various exercises on Kaggle, you will know by now that we wish real-world data problems were that simple, but they are not! This post is about just one…
