Building A Scorecard Web App For Residential Areas In Dubai
Mercury helps you easily convert your Jupyter notebook into a web app — Table of contents Introduction Source of data Code to create scorecard in Jupyter notebook Converting the Jupyter notebook to a Mercury web app Exploring the scorecard results Introduction Suppose we are analyzing how different residential areas in Dubai are performing based on a set of metrics we are interested in. They could be pure…
Biased Model Coefficients — (Part 1)
Attenuation bias/regression dilution biases your coefficients towards 0 — TL;DR — When you have significant noise or measurement error in your X variables, the model coefficients underestimate the impact of your variables, i.e., if your coefficients are positive (negative), the true impact is even more positive (negative). About the series Suppose you are a data scientist working for a company doing business in real…
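The attenuation effect described in this teaser can be reproduced with a small simulation. The sketch below is illustrative only and is not taken from the article: the true coefficient (2.0), the noise levels, and the helper name `ols_slope` are all assumptions chosen for the demonstration. With measurement noise of the same variance as the true X, the estimated slope should shrink to roughly half of the true value.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (single regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x

rng = random.Random(0)      # fixed seed so the run is repeatable
n = 20000
true_beta = 2.0             # assumed "true" effect of X on y

# True X, and an outcome that depends on the true X plus some outcome noise.
x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [true_beta * x + rng.gauss(0, 0.5) for x in x_true]

# What we actually observe: X contaminated by measurement error (variance 1).
x_noisy = [x + rng.gauss(0, 1) for x in x_true]

slope_clean = ols_slope(x_true, y)    # close to true_beta
slope_noisy = ols_slope(x_noisy, y)   # pulled towards 0

print(f"slope with true X:  {slope_clean:.3f}")
print(f"slope with noisy X: {slope_noisy:.3f}")
```

Under these assumptions the shrinkage factor is var(X) / (var(X) + var(noise)) = 1 / (1 + 1), which is why the noisy slope lands near 1 rather than 2.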
Agent-Based Model Visualization
Modeling the spread of COVID using a spatial model in Python’s Mesa library and visualization modules — Introduction In a previous article, I provided an introductory example (optimizing the number of supermarket counters) of how one can run agent based models using Python’s Mesa library. We tracked line charts of KPIs such as the average queue length and waiting time of our customers, but we used limited…
Intro to Agent Based Modeling
An example of how agent based modeling in Python can help determine the number of counters to open at a supermarket — Table of contents: 1- Intro: why agent based modeling? 2- Our supermarket queueing problem 3- Model design 4- Model execution 5- Conclusion Introduction Agent based modeling (ABM) is a bottom-up simulation technique where we analyze a system by its individual agents that interact with each other.
Imbalanced Data — Oversampling Using Gaussian Mixture Models
Other generative models can similarly be used for oversampling — TL;DR — Drawing samples from Gaussian Mixture Models (GMM), or other generative models, is another creative oversampling technique that can potentially outperform SMOTE variants. Table of contents Introduction Dataset preparation Intro to GMM Using GMM as an oversampling technique Evaluation of performance metrics Conclusion Introduction
Lookahead Decision Tree Algorithms
Lookahead mechanisms in decision trees can produce better predictions — TL;DR: I show that decision trees with a single-step lookahead mechanism can outperform standard, greedy decision trees (no lookahead). No overfitting or lookahead pathology is observed in the sample dataset. Why lookahead? Suppose we are trying to predict if a potential job candidate can be successful in his job.
Correlated Variables in Monte Carlo Simulations
Can sales of vanilla ice cream overtake chocolate? — Table of contents: Introduction Problem Statement Data preparation Wrong method 1 — Independent simulation (parametric) Wrong method 2 — Independent simulation (non-parametric) Method 1 — Multivariate distribution Method 2 — Copulas with marginal distributions Method 3 — Simulating historical combinations of sales growth Method 4 — Decorrelating store sales growth using PCA
Imbalanced Data — Oversampling Using Genetic Crossover Operators
Crossover/recombination oversampling adds novelty to a dataset and can score well on classification metrics vs. SMOTE and random oversampling — TL;DR — There are many ways to oversample imbalanced data, other than random oversampling, SMOTE, and its variants. In a classification dataset generated using scikit-learn’s make_classification default settings, samples generated using crossover operations outperform SMOTE and random oversampling on the most relevant metrics.
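To make the idea of crossover oversampling concrete, here is a minimal sketch. It assumes a uniform crossover operator (each feature of a synthetic row is copied from one of two randomly chosen minority-class parents); the article may use a different operator, and the function name `crossover_oversample` and the toy data are hypothetical.

```python
import random

def crossover_oversample(minority, n_new, rng):
    """Create n_new synthetic rows by uniform crossover of random parent pairs.

    minority: list of feature lists, all belonging to the minority class.
    """
    synthetic = []
    for _ in range(n_new):
        # Pick two distinct minority rows to act as parents.
        parent_a, parent_b = rng.sample(minority, 2)
        # For each feature, inherit the value from either parent with equal odds.
        child = [a if rng.random() < 0.5 else b
                 for a, b in zip(parent_a, parent_b)]
        synthetic.append(child)
    return synthetic

rng = random.Random(42)
# Toy minority class: 10 rows, 4 numeric features (illustrative data only).
minority = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(10)]

new_rows = crossover_oversample(minority, n_new=20, rng=rng)
print(len(new_rows), "synthetic minority rows created")
```

Unlike SMOTE, which interpolates between neighbours, this operator only recombines feature values that already exist in the minority class, so every synthetic value is one that was actually observed.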
Intro to Monte Carlo Simulation Using Business Examples
Assess probabilities of various business outcomes — Monte Carlo simulation is a computational technique that can be used for a wide range of functions such as solving some of the more difficult mathematical problems as well as risk management. We will go through 2 examples to demonstrate how Monte Carlo simulations can help you quantify risks in…
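As a rough illustration of how a Monte Carlo simulation can quantify business risk, the sketch below estimates the probability of making a loss. It is not one of the article's two examples: the profit formula, the input distributions, and every number in it are hypothetical assumptions picked only to show the mechanics.

```python
import random

def simulate_profit(rng):
    """One random scenario of profit under assumed (hypothetical) inputs."""
    units = rng.gauss(1000, 200)      # units sold: assumed normal
    price = rng.uniform(9, 11)        # selling price per unit: assumed uniform
    unit_cost = rng.uniform(6, 8)     # variable cost per unit: assumed uniform
    fixed_cost = 2500                 # assumed fixed cost
    return units * (price - unit_cost) - fixed_cost

rng = random.Random(7)                # fixed seed so the run is repeatable
n_trials = 20000

# Draw many scenarios and summarise the resulting distribution of profit.
profits = [simulate_profit(rng) for _ in range(n_trials)]
prob_loss = sum(p < 0 for p in profits) / n_trials
mean_profit = sum(profits) / n_trials

print(f"estimated probability of a loss: {prob_loss:.1%}")
print(f"average simulated profit:        {mean_profit:.0f}")
```

The output of interest is the whole distribution of outcomes rather than a single point forecast, which is what lets you attach a probability to an outcome such as "profit below zero".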