Skip to content

Latest commit

 

History

History
92 lines (48 loc) · 1.91 KB

ReadMe.md

File metadata and controls

92 lines (48 loc) · 1.91 KB

RosettaCode Data Project

This git repository contains (almost) all of the code samples available on http://rosettacode.org organized by Language and Task.

Getting the Data

All of the data is in this repository, so you can just run:

git clone https://github.com/acmeism/RosettaCodeData

However...

It's a lot of data!

If you just want the latest data, the quickest thing to do is:

git clone https://github.com/acmeism/RosettaCodeData --single-branch --depth=1

Tools

This repository's data content is created by a Perl program called rosettacode.

You can install it with this command:

cpanm RosettaCode

You can rebuild the data with:

make build

This repository has a bin directory with various tools for working with the data.

  • rcd-api-list-all-langs

    List all the programming language names directly from rosettacode.org

  • rcd-api-list-all-tasks

    List all the programming task names directly from rosettacode.org

  • rcd-new-langs

    List the RosettaCode languages not yet add to Conf

  • rcd-new-tasks

    List the RosettaCode tasks not yet add to Conf

  • rcd-samples-per-lang

    Show the number of code samples per language

  • rcd-samples-per-task

    Show the number of code samples per task

  • rcd-tasks-per-lang

    Show the number of tasks with code samples per language

  • rcd-langs-per-task

    Show the number of languages with code samples per task

To Do

Pull requests welcome!

This project is not a perfect representation of RosettaCode yet. It has a few uncicode issues. It also has to deal with various formatting mistakes in the mediawiki source pages.

  • Fix bugs

  • Correct the 100s of guessed file extensions in Conf/lang.yaml

  • Ability to only fetch cache pages since last pushed data update

  • Support names with non-ascii characters

  • Add more bin tools

  • Address errors reported in rosettacode.log after running make build