Tutorial: Installing CSVKit

Amanda on Mona edited this page Oct 28, 2018 · 1 revision

There are two methods. The first ("csvkit only") is fast and easy, but doesn't set you up to use other Python tools. The second ("Python Setup", at the bottom) takes a bit more setup but prepares you to do more with Python than just run CSVkit.

csvkit only

CSVkit is a suite of utilities written in Python. It is available as a Python package, which means we can use one of Python's package installers to install it on your computer. OSX ships with easy_install by default. Like a lot of people, I prefer pip, but pip isn't installed by default.

This isn't a Python class or a programming class. It's a data class, and all I really want is to walk you through how you might dig into a monster CSV file with some command-line tools. For that, it really doesn't matter how you install CSVkit, as long as you install it.

So here are some good options:

  1. Use sudo easy_install csvkit to install CSVkit. Then read a bit about what it does. If you secretly think you'll never touch the terminal again after this semester, this is your best route.

  2. Install Homebrew and use it to install Python (brew install python), which comes with pip. Then do pip install csvkit at the command line. If you get an error that suggests you don't have permission to install it, try sudo pip install csvkit to install with root privileges. Try man sudo if you want to understand what the command does. If you want to explore more programming, this is probably your best option.

  3. Alternatively, you can install pip with easy_install (using sudo easy_install pip) and then install CSVkit with sudo pip install csvkit. Choosing this option won't mean you can't install Homebrew later.
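
If you aren't sure which of these routes is open to you, a quick check like the one below can tell you which installer your shell can already find. This is only a sketch: pick_installer is a name I made up for illustration, and all it does is look for pip and easy_install on your PATH.

```shell
# pick_installer is a hypothetical helper, not part of CSVkit.
# It prints the first Python package installer found on your PATH.
pick_installer() {
  if command -v pip >/dev/null 2>&1; then
    echo "pip"
  elif command -v easy_install >/dev/null 2>&1; then
    echo "easy_install"
  else
    echo "none"
  fi
}

echo "Installer found: $(pick_installer)"
```

If it prints "none", option 1 or 3 above won't work until Python itself is installed.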

What if I can't sudo?

If you don't have admin privileges on your computer you'll have a hard time following the instructions that come with most software. You can still install Python modules and plenty else without admin privileges. You just need to follow a few instructions. Your first step is to tell Python where to install packages so it doesn't try to put them in a system directory. In a text editor (like TextWrangler), create a new file.

Save the file in your home directory and name it .pydistutils.cfg (the dot at the beginning matters). Put the lines below in it. If you already have a file called .pydistutils.cfg, edit it and add these lines to it:

[install]
install_lib = ~/Library/Python/$py_version_short/site-packages
install_scripts = ~/Library/Python/bin

You can confirm that you put it in the right place by doing cat ~/.pydistutils.cfg -- that should spit back exactly what you put into the file. If it says "No such file or directory", you didn't put it in your home directory.

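Next you're going to make the directories that you just told Python to use, by running each of the commands below. (The 2.7 should match your Python version, which is what $py_version_short stands for in the config file.)

mkdir -p ~/Library/Python/2.7/site-packages
mkdir -p ~/Library/Python/bin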

Then do easy_install csvkit and you should be able to install it just fine. You'll still have a hard time running it, however, because your shell doesn't yet know to look for commands in ~/Library/Python/bin. To fix that, you'll need to create another text file in your home directory, this time called .bashrc (again, if it already exists, just add to it). In that file you're going to put:

PATH=$PATH:~/Library/Python/bin/
export PATH

And then at the terminal, type source ~/.bashrc -- then (finally!) try running which csvcut and you should see that it is installed at /Users/{your name}/Library/Python/bin

If you find that CSVkit isn't in your path next time you open Terminal.app, you probably need to open your Preferences and tell Terminal that "Shells open with: Command" where the command is probably /bin/bash.
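
A quick way to check a new Terminal window is to test whether the directory actually shows up in PATH. The sketch below assumes the ~/Library/Python/bin location used above. path_has_dir is a name made up for illustration, not a standard command.

```shell
# path_has_dir is a hypothetical helper, not part of CSVkit or OSX.
# It succeeds when the directory given as $1 is one of the entries in PATH.
path_has_dir() {
  dir=${1%/}   # drop one trailing slash from the argument, if present
  case ":$PATH:" in
    *":$dir:"*|*":$dir/:"*) return 0 ;;   # match with or without trailing slash
    *) return 1 ;;
  esac
}

if path_has_dir "$HOME/Library/Python/bin"; then
  echo "Library/Python/bin is on your PATH"
else
  echo "Library/Python/bin is missing from your PATH"
fi
```

If it reports the directory as missing in a fresh window, your .bashrc isn't being read, which is what the Preferences change above is meant to address.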

Why? It's complicated

Python Setup

The csvkit installation instructions tell you to run pip install csvkit. They call that simple. It is simple if you have pip installed and know what it is. But if you don't, we need to take a few steps back.

csvkit is a Python package, pip is a package manager for Python, and Python is a programming language. So what you actually need to do is...

First: make sure python is installed.

Start by running which python or python --version -- that ought to give you a clue about whether or not you've got python running already. Some versions of OSX ship with python pre-installed.
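
As a rough sketch, the two checks can be rolled into one small function. check_python is a made-up name for illustration. The 2>&1 is there because Python 2 prints its version to standard error rather than standard output.

```shell
# check_python is a hypothetical helper, not part of Python or CSVkit.
# It reports where the python command lives and which version it is.
check_python() {
  if command -v python >/dev/null 2>&1; then
    echo "python found at: $(command -v python)"
    python --version 2>&1   # Python 2 writes the version to stderr
  else
    echo "python not found on PATH"
  fi
}

check_python
```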

Python-Guide has great instructions for getting started on Windows or OSX.

Jue Yang's walkthrough on getting set up is another good place to start.

Second: make sure pip is installed.

If you have Python but not Homebrew, you'll start with easy_install pip (or possibly sudo easy_install pip). If you went the Homebrew route, pip comes bundled with Homebrew's Python, so brew install python should do it.

Third: install csvkit.

Now try running pip install csvkit -- if you still get an error, it is time to ask for help.
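
To see whether the install worked, you can look for a few of the CSVkit commands on your PATH. This is a sketch, not part of the CSVkit instructions: check_csvkit_tools is a name made up for illustration, and csvcut, csvlook and csvstat are just three of the tools CSVkit installs.

```shell
# check_csvkit_tools is a hypothetical helper, not part of CSVkit.
# It prints one line per tool saying where it was found, if anywhere.
check_csvkit_tools() {
  for tool in csvcut csvlook csvstat; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: $(command -v "$tool")"
    else
      echo "$tool: not found on PATH"
    fi
  done
}

check_csvkit_tools
```

If every line says "not found" right after a successful install, the problem is more likely your PATH than CSVkit itself.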