🐍 A simple Python (2 or 3) script to generate a PNG word-cloud ☁️ image from a bunch of 📂 text files 🎉. Based on word_cloud by @amueller.

generate-word-cloud.py

A simple Python 🐍 script to generate a square wordcloud ☁️ from one or more text files. It supports both Python 2 and 3 (2.7+ and 3.4+), and is available on PyPI as generatewordcloud.

generate-word-cloud example meta

Based on the great word_cloud module by @amueller.


How to use it?

The usual module matplotlib is needed for the plotting, docopt is needed for the command line interface, and word_cloud is needed for the actual work (generating the cloud of words after reading the files).

The required Python (2 or 3) modules can be installed with pip, either directly:

# Directly (the word_cloud module is published on PyPI under the name wordcloud):
sudo pip install matplotlib docopt wordcloud

Or with the requirements.txt file:

sudo pip install -r requirements.txt

Note: if ansicolortags is available, it will be used to print nice colors in the help and during the generation of word clouds.
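To check which of these modules are present before running the script, something like the following could be used. It is only a sketch and not part of the script: the helper names `is_available` and `missing_modules` are made up for illustration, it assumes that word_cloud is imported under the name `wordcloud`, and it relies on `importlib.util.find_spec`, so it needs Python 3.4+.

```python
import importlib.util

# Import names assumed for illustration; word_cloud is imported as "wordcloud".
REQUIRED = ["matplotlib", "docopt", "wordcloud"]
OPTIONAL = ["ansicolortags"]


def is_available(module_name):
    """Return True if the top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def missing_modules(names):
    """Return the names (in order) of the modules that cannot be found."""
    return [name for name in names if not is_available(name)]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing required modules: " + ", ".join(missing))
    else:
        print("All required modules are installed.")
    if missing_modules(OPTIONAL):
        print("ansicolortags not found: messages will not be colored.")
```

Because the optional module is only looked up and never imported, the check stays cheap and has no side effects.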

2. Installation

Clone the repository, then copy the script (generate-word-cloud.py) to a directory in your PATH (e.g., ~/.local/bin/).

You can also just download the script itself:

$ wget https://raw.githubusercontent.com/Naereen/generate-word-cloud.py/master/generate-word-cloud.py
$ cp generate-word-cloud.py /path/to/a/directory/in/your/PATH/

Note: the script is also available from PyPI: pypi.python.org/pypi/generatewordcloud. You can install it using pip:

$ pip install generatewordcloud
$ # Or maybe you need sudo rights:
$ sudo pip install generatewordcloud


3. Usage

Help:

$ generate-word-cloud.py --help

From one or two files

Generate a wordcloud from two txt files in the current directory, and save it to wordcloud_txt.png:

$ generate-word-cloud.py -o ./wordcloud_txt.png ./file1.txt ./file2.txt

Generate a wordcloud from the textfile hamlet.txt (~ 8000 lines), saving to hamlet.png:

$ generate-word-cloud.py -o ./hamlet.png ./hamlet.txt

generate-word-cloud example hamlet

(It should work on pretty big text files without any issue.)
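For readers curious about what happens behind these commands, here is a rough sketch of the idea: count the words, keep the most frequent ones, and let word_cloud draw them. It is not the script's actual code. The function names `word_frequencies` and `save_cloud` are hypothetical, the word splitting is deliberately naive (the real library also filters out stop words, for instance), and the default values are simply copied from the --help text.

```python
import re
from collections import Counter


def word_frequencies(text, max_words=150):
    """Count lowercase words of 2+ letters and keep the max_words most common."""
    words = re.findall(r"[a-z']{2,}", text.lower())
    return dict(Counter(words).most_common(max_words))


def save_cloud(text, outfile="wordcloud.png", max_words=150, width=400, height=300):
    """Render the most frequent words of text to an image file."""
    # Imported here so that the counting helper works without word_cloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(width=width, height=height, max_words=max_words)
    cloud.generate_from_frequencies(word_frequencies(text, max_words))
    cloud.to_file(outfile)
    return outfile
```

With word_cloud installed, `save_cloud(open("hamlet.txt").read(), "hamlet.png")` would write an image comparable in spirit to the one above, though not identical since the word filtering differs.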


Other examples

From a lot of Python scripts (~ 200) 🐍

generate-word-cloud example python

From a lot of Bash scripts (~ 150) 🐚

generate-word-cloud example bash

From a lot of LaTeX files (~ 180)

generate-word-cloud example LaTeX

🎨 Meta example

Generate a wordcloud from the README.md and generate-word-cloud.py files of this very project, save it to wordcloud_meta.png!

$ generate-word-cloud.py -o ./wordcloud_meta.png ./*.md ./*.py

generate-word-cloud example meta


Features

  • Supports one or more input file(s), and cleanly skips any file it fails to find or fails to read,
  • Custom output file, which won't be overwritten (except with the -f flag),
  • Nice command line interface, powered by docopt. It was first written with argparse, but I switched to docopt after realizing how awesome it is!
  • Has a command line option for every important parameter (max number of words, width, height, etc.),
  • Input filenames containing spaces (e.g. this file.txt) used to be seen as several files: this was FIXED with the switch to docopt.
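The first two features can be sketched as follows. This is an assumption about how such behavior might be written, not an excerpt from the script: `read_inputs` and `can_write` are hypothetical names, reading as UTF-8 with replacement of undecodable bytes is my own choice, and the interactive confirmation mentioned in the --help text is left out for brevity.

```python
import io
import os


def read_inputs(paths):
    """Concatenate the content of the readable files and list the skipped ones."""
    chunks, skipped = [], []
    for path in paths:
        try:
            # errors="replace" keeps going on bytes that are not valid UTF-8.
            with io.open(path, "r", encoding="utf-8", errors="replace") as f:
                chunks.append(f.read())
        except (IOError, OSError):
            # Missing or unreadable file: remember it and carry on.
            skipped.append(path)
    return "\n".join(chunks), skipped


def can_write(outfile, force=False):
    """Allow writing if the file does not exist yet, or if forced (-f)."""
    return force or not os.path.exists(outfile)
```

Returning the skipped paths, rather than printing inside the loop, leaves it to the caller to decide how to report them.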

📃 Complete documentation (--help)

$ generate-word-cloud.py -h | --help
Usage:
  generate-word-cloud.py [-s | --show] [-f | --force] [-o OUTFILE | --outfile=OUTFILE]
                         [-t TITLE | --title=TITLE] [-m MAX | --max=MAX]
                         [-w WIDTH | --width=WIDTH] [-H HEIGHT | --height=HEIGHT]
                         INFILE...
  generate-word-cloud.py (-h | --help)
  generate-word-cloud.py (-v | --version)

Options:
  -h --help            Show this help message and exit.
  -v --version         Show program's version number and exit.
  -s --show            Show the image but do not save it [default False].
  -f --force           Force to write the image, even if present (default is to ask before overwriting an existing file) [default False].
  -o OUTFILE --outfile=OUTFILE
                       Filename for the generated image [default 'wordcloud.png'].
  -t TITLE --title=TITLE
                       Title for the image [default None].
  -m MAX --max MAX
                       Max number of words to display on the cloud word [default 150].
  -w WIDTH --width WIDTH
                       Width of the generate image [default 400].
  -H HEIGHT --height HEIGHT
                       Height of the generate image [default 300].
  INFILE               A text file to read.
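docopt hands the parsed command line back as a dictionary whose option values are strings (or None when an option is absent), so numeric options such as --max need converting before use. The helper below is a hypothetical sketch of that step, not the script's own code: `normalize_options` is an invented name, and the fallback values are copied from the help text above.

```python
def normalize_options(args):
    """Turn a docopt-style dict of strings into typed options with fallbacks."""

    def as_int(key, fallback):
        value = args.get(key)
        return int(value) if value is not None else fallback

    return {
        "infiles": list(args.get("INFILE") or []),
        "outfile": args.get("--outfile") or "wordcloud.png",
        "title": args.get("--title"),
        "max_words": as_int("--max", 150),
        "width": as_int("--width", 400),
        "height": as_int("--height", 300),
        "show": bool(args.get("--show")),
        "force": bool(args.get("--force")),
    }
```

With docopt installed, it could be called as `normalize_options(docopt(usage_text))`, where `usage_text` stands for the help message shown above.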

πŸ“ TODO

  • Start it, from this example,
  • Run it on some interesting examples, embed them here (as images),
  • Check on weird encodings? (i.e., not UTF-8). It works fine!
  • Test it against πŸ“• VERY large files (millions of lines) ? It works fine, slowly but fine.
  • Test it against πŸ“š LOTS of files (several thousands) ? It works fine, slowly but fine.
  • Publish it on PyPI: it is available at pypi.python.org/pypi/generatewordcloud/
  • Write a small article about it for my blog.

πŸ› Knows issues

  • Only tested on (X)Ubuntu (15.10), but it should work on other GNU/Linux distribution and Mac OS X (and probably Windows), if they support docopt and has both docopt and word_cloud installed.

πŸ› Unknown issues?

Use the issue tracker to notify me of a bug!


About

Why write this script?

There are already plenty of good word cloud generators online, e.g. wordle.net.

  1. I wanted a way to visualize the major keywords of Bash and Python (my two favorite programming languages) and of Markdown/Strapdown, reStructuredText and LaTeX (my favorite document typesetting systems),
  2. The original project word_cloud seemed cool. And it is. Great job @amueller!
  3. Clouds of words are interesting! And Python is awesome!

Author

Lilian Besson (Naereen).

📜 License

This script is published under the terms of the GPLv3 License (file LICENSE), © Lilian Besson, 2016.
