Skip to content
/ AlphaB Public

Python library for rendering charts and computing statistics for A/B testing

License

Notifications You must be signed in to change notification settings

Wikia/AlphaB

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

81 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AlphaB

Python library for rendering charts and computing statistics for A/B testing

Current state

This library is in the very initial state. Currently, it supports only A/B testings (two groups).

AlphaB allows you to:

  • Automatically generate charts from A/B testings
  • Compute statistics in order to confirm a statistical significance between groups

Table of content

Requirements

You can directly install all of the requirements for AlphaB by running pip install -r requirements.txt from the root of the repository.

  • Matplotlib - a library to generate charts from data sets
  • Pandas - a library providing high-performance, easy-to-use data structures and data analysis tools
  • Numpy - a library providing support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions
  • Scipy - a library used for scientific computing and technical computing
  • Pathlib - offers a set of classes to handle filesystem paths
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from pathlib import Path

How to use it

It is highly recommended to use Jupyter to perform A/B testing analysis in Python, and AlphaB is built to be used in Jupyter Notebooks.

Here is an example usage for AlphaB (this example doesn't include specifying a data set for now):

#!/usr/bin/env python3

from alphab import BucketTest
import pandas as pd


def main():
    df = pd.DataFrame()
    bucket_test = BucketTest(
        df=df,
        variable='impressions',
        group='design',
        x_axis='date',
        custom_title='Impressions by design',
        custom_ylabel='#',
        custom_day_interval=1
    )
    bucket_test.render()
    bucket_test.compute_pvalues()

Arguments

When creating a bucket test, you can specify the following arguments:

  • df - data frame to be used for the bucket test. It is recommended to group the data frame before passing it (e.g.: When doing a bucket test on the group design, you should group the data frame by design and date first)
  • variable - specifies the values on the y-axis for the chart and statistical significance check
  • group - the name of the column which the data frame is grouped by
  • x_axis (default: date) - specifies the values on the x_axis for the chart
  • custom_title (default: {variable} per {group}) - specifies the title for the chart
  • custom_day_interval (default: 1) - specifies the difference between the dates on the x-axis

For the render() method, those options can be specified to customize your chart:

  • figure_size_x (default: 12) - the width of the chart (in inches)
  • figure_size_y (default: 5) - the height of the chart (in inches)
  • line_width (default: 3) - the line width in a line chart (in points)
  • title_font_size (default: 16) - the font size of the title in the figure
  • legend_font_size (default: 14) - the font size of the legend in the figure
  • rotation (default: 30) - the rotation of the x ticks (in degrees)

In the compute_pvalues() method, you can customize the p-value used to reject a null hypothesis by adjusting the alpha value (default: 0.01). Recommended values are: 0.01, 0.05, 0.1. Read more about statistical significance and p-value here. This research paper is also a good place to start for those who want to better understand those topics.

Screenshots

A generated chart and statistical significance analysis example:

Next steps

  • Customize the number of groups that are taken into account A/B/C testings, A/B/C/D testings, A/B/C/D/E testings
  • Render charts and compute p-values for data from more than one data frame
  • Create tests for render and compute_pvalue methods
  • Handle other x_axis that date only
  • Customize names of images plt.savefig(Path(""))

How to contribute

You can contribute by forking this repository, looking through the issues of the repository, and opening a PR on your fork. Please make sure to write a clear PR description and to provide examples for how your new feature works.

Contributors

  • Brendan Le Glaunec @Ullaakut - Thank you for your support in reviewing the code and the documentation and motivating me to work on this project!
  • The method for checking statistical significance was highly inspired by the work of Paulina Gralak @Loczi94
  • Katarzyna Pudełko @bckatarzyna - Thank you for code reviews!
  • Martyna Urbanek-Trzeciak @martynaut - Thank you for code reviews!

Thanks a lot to all of you!

The creator and maintainer: Julia Jakubczak @veliona

About

Python library for rendering charts and computing statistics for A/B testing

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •