An end-to-end exploratory data project using R and Python

Image by Author


“Let’s order Thai.”

“Great, what’s your go-to dish?”

“Pad Thai.”

This has bugged me for years and is the genesis for this project.

People need to know they have other choices aside from Pad Thai. …

Pitfalls to avoid for new data scientists

Image by Andrew Ridley on Unsplash

Build a project portfolio.

Arguably the most pervasive advice in data science.

Listening to the excellent Build a Career in Data Science Podcast, I was surprised to learn few people heed this advice.

A portfolio showcases your interests, skills and abilities to reason about data. It can convince a hiring…

Let’s do better


“Let’s order Thai.”

“Great, what’s your go-to dish?”

“Pad Thai.”

This has bugged me for years.

Pad Thai shouldn’t be your first choice of Thai food.

Like turkey on Thanksgiving, most Pad Thai is overrated. Instead of a bang, it’s a whimper.

There, I said it.

Pad Thai was created…

Using R to visualize disparities in student debt and college attainment

Data suggests student debt bites twice.

First, stalling wealth creation.

Second, if it prevents people from finishing college, this further sets back wealth creation.

Previously, I examined differences in college degree attainment between White, Black and Hispanic Americans.

Image by Author

The Widening Gap [1] led to a hypothesis:

Wealth inequality is positively…

In light of recent euphoria, here’s a compelling bear case.

Photo by Michael Dziedzic on Unsplash

I’m bullish Bitcoin and Ethereum.

And any technology to redistribute power, resist censorship and preserve privacy.

In light of the current crypto euphoria, I’d like to entertain the best bear case I’ve heard. Paraphrasing Demetri Kofinas, host of Hidden Forces:

Using R and Python to visualize the relationship between Market Cap and Hourly Cost to Attack

Image by Author


In this post, I use Python and R to access, parse, manipulate, then visualize data to show the strong relationship between Market Capitalization and Cost to Attack among public crypto networks.


Rule-based Sentiment Analysis Using Python and R

Image by Author


Why Sentiment Analysis?

NLP is a subfield of linguistics, computer science and artificial intelligence (wiki), and you could spend years studying it.

However, I wanted a quick dive to get an intuition for how NLP works, and we’ll do that via sentiment analysis, categorizing texts by their polarity.
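To make "categorizing by polarity" concrete, here is a minimal rule-based sketch in Python. It is not the method from the full post: the tiny `LEXICON`, the `NEGATIONS` set and the function names `polarity_score` and `classify` are all hypothetical, made up for illustration. The idea is just that each word carries a score, a preceding negation flips it, and the sign of the total decides the label.

```python
import re

# Hypothetical toy lexicon: word -> polarity score (an assumption, not a real resource).
LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "terrible": -2.0, "hate": -2.0,
}

# Words that flip the polarity of the word immediately after them.
NEGATIONS = {"not", "no", "never"}


def polarity_score(text: str) -> float:
    """Sum the lexicon scores of all words, flipping a score after a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, token in enumerate(tokens):
        value = LEXICON.get(token, 0.0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    return score


def classify(text: str) -> str:
    """Label text as positive, negative or neutral from the sign of its score."""
    score = polarity_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in ["I love this great movie", "This is not good", "The sky is blue"]:
    print(sentence, "->", classify(sentence))
```

Real rule-based tools use much larger lexicons plus rules for intensifiers, punctuation and capitalization, but the core loop is the same shape.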

We can’t help but feel…

Use R to find out which metrics drive people to click on your profile

Image by Author

Overview & Setup

This post uses various R libraries and functions to help you explore your Twitter Analytics Data. The first thing to do is download your data. The assumption here is that you’re already a Twitter user and have been using it for at least 6 months.

Once there, you’ll click on…

Using code to develop a feel for how machine learning optimization works

Photo by Fineas Anton on Unsplash


In this post, we’ll explore Gradient Descent from the ground up, starting conceptually, then using code to build up our intuition brick by brick.
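As a taste of what "brick by brick" can look like, here is a minimal sketch of gradient descent in plain Python. It is not the code from the full post: the function names, the choice of minimizing a sum of squares, and the `learning_rate` and `epochs` defaults are assumptions picked for illustration. The loop computes the gradient at the current point, takes a small step in the opposite direction, and repeats.

```python
from typing import List

Vector = List[float]


def sum_of_squares_gradient(v: Vector) -> Vector:
    """Gradient of f(v) = sum(v_i ** 2), which is 2 * v_i for each component."""
    return [2 * v_i for v_i in v]


def gradient_step(v: Vector, gradient: Vector, step_size: float) -> Vector:
    """Move v by step_size along the gradient (a negative step_size goes downhill)."""
    return [v_i + step_size * g_i for v_i, g_i in zip(v, gradient)]


def minimize(start: Vector, learning_rate: float = 0.1, epochs: int = 200) -> Vector:
    """Repeatedly step against the gradient, starting from `start`."""
    v = list(start)
    for _ in range(epochs):
        grad = sum_of_squares_gradient(v)
        v = gradient_step(v, grad, -learning_rate)
    return v


result = minimize([3.0, -4.0, 5.0])
print(result)  # every component should end up very close to 0
```

With a learning rate of 0.1, each step shrinks every component by a factor of 0.8, so the point slides toward the minimum at the origin. A learning rate that is too large would overshoot instead, which is worth trying by hand.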

While this post is part of an ongoing series where I document my progress through Data Science from Scratch by Joel Grus, for this post I…

Exploring the BBC’s Top 100 Influential Women of 2020 with interactive plots

Image by Author


This is a quick walkthrough of using the sunburstR package to create sunburst plots in R. The original document is written in RMarkdown, which is an interactive version of markdown.

The following code can be run in RMarkdown or an R script. …

Paul Apivat

Data-Informed People Decisions
