Skip to content

Tip Sheet: Spreadsheets

Amanda on Mona edited this page Oct 28, 2018 · 3 revisions

Every year, students ask for a guide to important spreadsheet tips and tricks. The truth is that everyone has their own favorites, but here are some of mine. LibreOffice Calc is my preferred spreadsheet tool, but most of these forumlas, functions and techniques will work just as well in Microsoft Excel or Google Sheets.

A Function For Everything

Want to know what the mortgage payment on a 30 year, 4.5% loan of $365,291 is? There's a function for that. It's =PMT(Rate, NPer, PV,FV) -- or PMT(0.045/12,30*12,365291,0). There are calculators for pretty much anything. If you find yourself saying "I wonder if there's a function that will..." go look! Tell you the name of the month that a particular date is in? Yes. The day of the week of a date? Yes. How much interest you've paid on that mortgage after five years? Yes.

LibreOffice's function reference is much clearer than Excel's and the functions are very close. So check out either documentation if you're looking for a function but not sure it exists.
LibreOffice Function Reference | Excel Function Reference

Ditching the Mouse

The best way to get faster at using spreadsheets is to quit using the mouse for basic operations like Save, Copy and Paste. If you don't already compulsively hit s every time you stop to think, start doing that now. And start using x to cut, ⌘c to copy, and v to paste.

One more keyboard shortcut to commit to memory: z or "undo" -- if you do something wonky and you aren't even sure what you just did, z.

Text to Columns

Try downloading Google's data on flu related searches. Use data > text to columns ... to break the text block into columns, and use some Spreadsheet formulas to find =MAX() values.

Filtering and Transposing

Use the auto filter button to set up filters that let you sort data by a single column.

Use paste special > transpose to shift rows into columns.

Cell references

Payscale prepared a list of CEO Salaries and median employee pay for the companies on Fortune's 100 list. They actually did the interesting calculations for you, but hypothetically if you wanted to know how all those salaries compared to the average (or mean) Fortune 100 CEO salary , you'd do something like this:

  A B C
1 Wal-Mart Stores $23,150,000.00 =B1/B101
2 Target $17,890,000.00
3 Walt Disney $31,630,000.00
4 Honeywell $32,860,000.00
5 McDonald's $9,550,000.00
...
101 Mean =AVERAGE(B1:B100)

But if you pull that formula down to populate other cells, it will auto increment, and you'll see =B2/B102 and =B3/B103 when what you really want is to keep your denominator constant. So you use $ to prevent it from incrementing: =B1/B$101 and you'll get something more like this when you paste the formula:

  A B C
1 Wal-Mart Stores $23,150,000.00 =B1/B$101
2 Target $17,890,000.00 =B2/B$101
3 Walt Disney $31,630,000.00 =B3/B$101
4 Honeywell $32,860,000.00 =B4/B$101
5 McDonald's $9,550,000.00 =B5/B$101
...
101 Mean =AVERAGE(B1:B100)

Tinker! Figure out what happens when you drag a formula to the right instead of down.

Read Up Or just read up on relative and absolute cell references. Try hitting Commandt after you add a cell reference, for extra bonus fun.

Splitting and combining cells

Data comes in all kinds of shapes and forms, and you often need to reshape it. OpenRefine or Workbench are good at that, but you can do a lot of reshaping right inside your spreadsheet. Let's say I have a list that looks something like this:

Bartlett, Sarah – Professor; Director, Urban Reporting
Beinart, Peter – Associate Professor, National Political Reporting, Opinion Writing
Glenn, Adam – Associate Professor, Interactive
Gray, Barbara – Distinguished Lecturer, Interim Chief Librarian

And I want it to look a little more like this:

Sarah Bartlett (Professor)
Peter Beinart (Associate Professor)
Adam Glenn (Associate Professor)
Barbara Gray (Distinguished Lecturer)

The first thing I want is to get first name, last name and title in their own columns. I can do that in a few different ways. The most obvious is Text to Columns...:

Data > Text to Columns... will allow you to split any data by any delimiter, more or less. So I can use as my first delimiter, save that, and then add a column and do it again on the names column using , as the delimiter. If that sounds confusing, try it (Do read the dialogs as you get to them. Don't just click willy-nilly and wonder why you don't understand!).

I should have something that looks a bit like this now:

  A B C
1 Bartlett Sarah Professor
2 Beinart Peter Associate Professor

Then, I want to string together the cells in the new order. I'm going to use a combination of strings and cell references to do that. A string is any characters, strung together.

=CONCATENATE(B1," ",A1," (",C1,")")

My strings are (a blank space), ( and ) -- note the spaces. A1 is a last name, B1 is a first name, C1 is a title. So if I want "first name" space "last name" I need to put B1 first, then a literal space (in quotes) and then A1.

If I'm happy with that, I can select the remaining cells that I'd like to apply the same formula to and use Edit > Fill > Down to extend the formula.

Concatenation can be especially useful if you need addresses in a form that an automated geocoder will recognize. Often addresses are (rightly) broken into discrete columns:

  A B C D
1 219 W 40th Street New York NY 10009
2 2900 Bedford Ave Brooklyn NY 11210

But most automated tools that can look up lat/long points from addresses expect something more like "219 W 40th Street, New York NY, 10009" -- so to get from your table to a full address you'd do something like this:

`=CONCATENATE(A1, ", ", B1," ", C1,", ", D1)

Explore! See if you can identify the quoted strings in that formula: what are they adding? What happens if you take them out? Do spaces after the comma make a difference? =CONCATENATE(A1, B1,C1,D1)

More Handy Slicing Formulas

Other formulas that are handy for slicing and re-combining cells: =find, =right, =left

Say you have a column of "borocd" values -- lots of NYC data includes exactly that. 101, 102, 103 are t he first, second and third community districts respectively in Boro 1 (Manhattan, usually). If you think of these as strings instead of numbers, it is easy to see how you want the 1 leftmost string (the boro) in one column and the two rightmost strings (the CD) in another. The formula is quite straightforward:

A B C
101 =RIGHT(A1,2) =LEFT(A1,1)
201 01 2
301 01 3
401 01 4
102 02 1
102 02 1

Try it out.

Compound formulas

We were working with a uniquely challenging CSV this week, one that had one column full of line breaks. Each cell looked something like this:

"429 127 Street New York, NEW YORK 10027 (40.813196742697244, -73.95433261286013)"

We wanted the latitude and longitude values, but Text to Columns wasn't working out, so we tried another tactic. The lat/lon pairs are each about 40 characters, so we could use =RIGHT(A1, 40) to take the last (or right-most) 40 characters. Unfortunately, some of the values are shorter than others. So what we really need to know, before we can use =RIGHT() is how many characters make up the portion that we want.

So we need to know how many characters there are in the cell, and we need the position of the ( in the cell. And then we can find out the difference -- that will tell us how many characters there are after the (. If that's already puzzling to you, stop and sketch it out. How would you figure out the number of characters to the right of something?

How do we find out how many characters are in A1? With =LEN(A1) How do we find out the position of the first (? With =FIND(A1,"(") -- you can put any character in quotes there. We'd find the first "q" with =FIND(A1,"q"). If the length is in B1 and the position in C1, we can calculate the difference with =B1-C1.

A B C D
429 127 St New York, NY 10027 (40.8, -73.95) =LEN(A1) =FIND(A1,"(") =B1-C1
1577 Coney Island Ave Bklyn, NY 11230 (40.62, -73.96) 53 39 14
225 Broadway New York, NY 10007 (40.71, -74.2) 46 33 13
5006 8th Ave Bklyn, NY 11220 (40.64, -74) 41 30 11
6809 20th Ave Bklyn, NY 11204 (40.61, -73.99) 45 31 14
787 Lafayette Ave Bklyn, NY 11221 (40.69, -73.94) 49 35 14

So we can plug D1 into our =RIGHT() formula: =RIGHT(A1,D1) to use the value that we calculated. Try that out, to make sure you understand how these formulas are each working. And then, let's try bundling them all into one formula, but substituting formulas for individual cell references. If B1 is =LEN(A1) and C1 is =FIND(A1,"(") then D1 is just =LEN(A1)-FIND(A1,"("). Basic substitution. And ... if D1 is =LEN(A1)-FIND(A1,"(") then we can just write our =RIGHT function as one series of nested functions: =RIGHT(A1,LEN(A1)-FIND(A1,"("))

PS. It turns out you can also remove line breaks using search and replace or =CLEAN(). That would have been much easier but so much less fun.

VLOOKUP

The =WEEKDAY() function is handy for finding out what day of the week a date falls on, but it just returns a number. 1 for Sunday, 2 for Monday, etc. About like this:

A B
4/15/14 2:33 =WEEKDAY(A1)
3/28/14 8:24 6
4/10/14 4:12 5
8/26/14 9:32 3

That's nice, but what day of the week is 5 again? If you're not Brazilian you don't actually think of weekdays in terms of numbers. Enter, =VLOOKUP().

Make a separate lookup table. You can put it on its own sheet, or just off the right someplace:

G H
1 Sunday
2 Monday
3 Tuesday
4 Wednesday
5 Thursday
6 Friday
7 Saturday

=VLOOKUP() takes at least three variables: the value to look for, the table to look in and the column of that table to use as the new value. The fourth variable is optional, but worth exploring if you want to get fancy. For now we'll use the first three. Our first is the value returned by the =WEEKDAY() function. Our second is a range, G1:H7 -- the range that includes our lookup table. And we want to get our new values from the second of the two columns. So our formula will start out looking like =VLOOKUP(B1,G1:H7,2) -- when you try to fill in the rest of the column, however, you're going to discover that you only want relative cell references for the first variable. Your lookup table isn't moving. So lock it in with G$1:H$7:

A B C
4/15/14 2:33 =WEEKDAY(A1) =VLOOKUP(B1,G$1:H$7,2)
3/28/14 8:24 6 Friday
4/10/14 4:12 5 Thursday
8/26/14 9:32 3 Tuesday

Try it -- see if you can reproduce this with a compound function instead of a separate column for =WEEKDAY() and then take a stab at working with a more complex lookup table. Say you actually want the values from column I instead of H:

G H I
1 Sunday n/a
2 Monday Housing
3 Tuesday Biz Dev
4 Wednesday Transpo
5 Thursday Health
6 Friday Labor
7 Saturday n/a

Pivot Tables

Far and away the most confusing and powerful feature of any spreadsheet program is the Pivot Table. A pivot table lets you reshape rows and columns. (coming soon!)

Other Good Roundups

Mary Jo Webster's Favorite Formulas (PDF) from NICAR 2011, my own writeup of =SUMPRODUCT, an utterly unintuitive tool for counting rows that match two criteria.