Use R to find out which metrics drive people to click on your profile

Image by Author

Overview & Setup

This post uses various R libraries and functions to help you explore your Twitter Analytics data. The first thing to do is download your data from analytics.twitter.com. The assumption here is that you’re already a Twitter user and have been using it for at least six months.


Using code to develop a feel for how machine learning optimization works

Photo by Fineas Anton on Unsplash

Overview

In this post, we’ll explore Gradient Descent from the ground up, starting conceptually and then using code to build our intuition brick by brick.
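
To make the idea concrete, here is a minimal sketch in Python that is not taken from the post itself. The function being minimized, f(x) = (x - 3)^2, along with the starting point, learning rate and step count, are illustrative assumptions. The sketch repeatedly steps against the gradient until it settles near the minimum.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # move downhill
    return x


def grad_f(x):
    """Gradient of the assumed example function f(x) = (x - 3) ** 2."""
    return 2 * (x - 3)


# Hypothetical starting point; the true minimum of f is at x = 3.
minimum = gradient_descent(grad_f, start=0.0)
print(minimum)
```

A smaller learning rate would approach the minimum more slowly, while a rate above 1.0 would overshoot and diverge for this particular function.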


Exploring the BBC’s Top 100 Influential Women of 2020 with interactive plots

Image by Author

Overview

This is a quick walkthrough of using the sunburstR package to create sunburst plots in R. The original document is written in R Markdown, which combines Markdown text with executable R code chunks.

Load Libraries

The two main libraries are tidyverse (mostly dplyr, so you can load just that if you prefer) and sunburstR. Other packages can also produce sunburst plots, including plotly and ggsunburst (built on ggplot2), but we'll explore sunburstR in this post.

library(tidyverse)
library(sunburstR)

Load Data & Explore

The data is from week 50 of TidyTuesday, exploring the BBC’s top 100 influential women of 2020. …


Building intuition for statistical concepts using code

Cover Photo by Nasonov Aleksandr on Unsplash

Overview

This is a continuation of my progress through Data Science from Scratch by Joel Grus. We’ll use a classic coin-flipping example in this post because it is easy to illustrate both conceptually and in code. The goal of this post is to connect the dots between several concepts, including the Central Limit Theorem, hypothesis testing, p-values and confidence intervals, using Python to build our intuition.

Central Limit Theorem

Terms like “null” and “alternative” hypothesis are used quite frequently, so let’s set some context. The “null” is the default position. The “alternative”, alt for short, is something we’re comparing to the default (null).
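
As a rough illustration of how the two hypotheses are compared, the sketch below assumes the null is "the coin is fair" (p = 0.5) and the alternative is "the coin is not fair". It uses the normal approximation to the binomial, without a continuity correction, to estimate a two-sided p-value. The helper names and the observed count of 530 heads in 1,000 flips are hypothetical and not from the post.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution."""
    return (1 + math.erf((x - mu) / (sigma * math.sqrt(2)))) / 2


def two_sided_p_value(heads, flips, p_null=0.5):
    """Approximate probability, under the null, of a result at least
    this far from the expected number of heads (normal approximation)."""
    mu = flips * p_null
    sigma = math.sqrt(flips * p_null * (1 - p_null))
    z = abs(heads - mu) / sigma
    return 2 * (1 - normal_cdf(z))


# Hypothetical observation: 530 heads in 1,000 flips.
p_value = two_sided_p_value(heads=530, flips=1000)
print(p_value)
```

A small p-value suggests the observation would be unusual if the null were true. The threshold for "small" (often 5%) is a convention chosen before looking at the data.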


Using statistics to help users find your product

Photo by Jean-Louis Paulin on Unsplash

Overview

The itertools module is a core set of fast, memory-efficient tools for creating iterators for efficient looping (read the documentation here).

Itertools Permutations

One (of many) uses for itertools is its permutations() function, which returns all possible orderings of the items in a list.
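
A minimal sketch of the standard-library call, using a made-up three-item list. It also contrasts permutations() with combinations(), since the two are easy to confuse: permutations treat different orderings as distinct, while combinations do not.

```python
from itertools import combinations, permutations

# Hypothetical example items, not from the post.
items = ["a", "b", "c"]

# Every ordering of all three items: 3! = 6 tuples.
orderings = list(permutations(items))

# Every ordered pair of distinct items: 3 * 2 = 6 tuples.
ordered_pairs = list(permutations(items, 2))

# Every unordered pair: 3 tuples, because ("a", "b") and ("b", "a") count once.
unordered_pairs = list(combinations(items, 2))

print(orderings)
print(ordered_pairs)
print(unordered_pairs)
```

Both functions return lazy iterators, so wrapping them in list() is only needed when you want all results in memory at once.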


From single events to distributions of events

Photo by Riho Kroll on Unsplash

Context

Several earlier posts could serve as context (as needed) for the concepts discussed in this post, including these posts on:

Distributions

In this post, we’ll cover probability distributions. This is a broad topic so we’ll sample a few concepts to get a feel for it. Borrowing from the previous post, we’ll chart our medical diagnostic outcomes.


Using probability to guide decision making during a pandemic

Photo by freestocks on Unsplash

Note: This article presents a hypothetical situation and is not intended as medical advice.


Building on our understanding of conditional probability, we’ll get into Bayes’ Theorem

Overview

This post is a continuation of my coverage of Data Science from Scratch by Joel Grus.

Bayes’ Theorem

Previously, we established an understanding of conditional probability by building up from marginal and joint probabilities. We explored the conditional probabilities of two outcomes:

Outcome 1: What is the probability of the event “both children are girls” (B) conditional on the event “the older child is a girl” (G)?

The probability for outcome one is 50% (1/2).

Outcome 2: What is the probability of the event “both children are girls” (B) conditional on the event “at least one of the children is a girl” (L)?

The probability for outcome two is roughly 33% (1/3). …
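
Both results can be checked by enumerating the four equally likely two-child families. This is a small sketch under the usual assumption that each child is independently a boy or a girl with equal probability. The helper name `conditional` is illustrative and not from the book or the post.

```python
from fractions import Fraction
from itertools import product

# Each family is (older child, younger child); all four are equally likely.
families = list(product(["boy", "girl"], repeat=2))


def conditional(event, given):
    """P(event | given), computed by counting equally likely outcomes."""
    given_outcomes = [f for f in families if given(f)]
    both = [f for f in given_outcomes if event(f)]
    return Fraction(len(both), len(given_outcomes))


def both_girls(family):            # event B
    return family == ("girl", "girl")


def older_is_girl(family):         # event G
    return family[0] == "girl"


def at_least_one_girl(family):     # event L
    return "girl" in family


p_b_given_g = conditional(both_girls, older_is_girl)      # outcome 1
p_b_given_l = conditional(both_girls, at_least_one_girl)  # outcome 2
print(p_b_given_g, p_b_given_l)
```

The difference comes from the size of the conditioning set: two families have an older girl, but three have at least one girl.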


Overview

This post covers chapter 6, continuing my coverage of Data Science from Scratch by Joel Grus. We will work our way towards understanding conditional probability by first understanding the concepts that precede it, such as marginal and joint probabilities.

Challenge

The first challenge in this section is distinguishing between two conditional probability statements.


How specific features of the Python language can be used to build tools used to describe data and relationships within data

Overview

This post covers chapter 5, continuing my coverage of Data Science from Scratch by Joel Grus.


Specifically, we’ll examine how features of the Python language, together with functions we built in a previous post on Vectors (see also Matrices), can be used to build tools for describing data and the relationships within it (aka statistics). …

About

Paul Apivat Hanvongse

Data-Informed People Decisions
