Modin-spreadsheet is the underlying package for the Modin Spreadsheet API. It renders DataFrames within a Jupyter notebook as a spreadsheet and makes them easy to explore with intuitive scrolling, sorting, and filtering controls. The spreadsheet supports click-to-edit cells, adding and removing rows, and similar operations, and it can also be controlled through the API. Modin-spreadsheet also records the history of changes made so that you can share or reproduce your results.
Modin-spreadsheet builds on top of SlickGrid and Modin to provide a highly responsive experience even on DataFrames with 100,000 rows.
Modin-spreadsheet is forked from Qgrid, which was developed by Quantopian. Some documentation will reference Qgrid documentation as we continue to build out our own documentation. To learn more about Qgrid, here is an introduction on YouTube.
Here is an example of the Modin-spreadsheet widget in action.
A brief demo showing the common use cases for Modin-spreadsheet: filtering, editing, sorting, generating reproducible code, and exporting the changed dataframe
Full documentation for Modin-spreadsheet is still in progress. Most features are documented on Qgrid's readthedocs: https://qgrid.readthedocs.io/.
Modin-spreadsheet is intended to be used through the Modin Spreadsheet API (Docs in progress...). Please install Modin and Modin-spreadsheet by running the following:
pip install modin
pip install modin[spreadsheet]
To enable the Modin-spreadsheet widget, you may need to also run:
jupyter nbextension enable --py --sys-prefix modin_spreadsheet
# only required if you have not enabled the ipywidgets nbextension yet
jupyter nbextension enable --py --sys-prefix widgetsnbextension
If needed, Modin-spreadsheet can be installed through PyPI.
pip install modin-spreadsheet
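After installing, it can be useful to confirm that both packages are importable from the Python environment your notebook kernel uses. The following is a minimal sketch using only the standard library; the helper name `check_installed` is hypothetical and not part of Modin or Modin-spreadsheet. It only checks that the top-level modules `modin` and `modin_spreadsheet` can be located, not that the notebook extension is enabled.

```python
from importlib.util import find_spec

# Top-level import names expected from the two pip packages above.
REQUIRED_MODULES = ("modin", "modin_spreadsheet")


def check_installed(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be located.

    Hypothetical helper for illustration; it does not import the modules,
    it only asks the import system whether they can be found.
    """
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_installed().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If either module is reported as missing, re-run the pip commands above in the same environment that Jupyter uses.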
Column-specific options: This feature lets you set options on a per-column basis, so you can explicitly specify which columns should be sortable, editable, etc. For example, if you wanted to prevent editing on all columns except for a column named 'A', you could do the following:
col_opts = { 'editable': False }
col_defs = { 'A': { 'editable': True } }
modin_spreadsheet.show_grid(df, column_options=col_opts, column_definitions=col_defs)
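To make the relationship between the two arguments concrete, here is a small sketch of how the options for each column could be resolved. It assumes that entries in `column_definitions` take precedence over the grid-wide `column_options`, which is what the example above implies. The function `effective_column_options` is hypothetical and is not the library's implementation.

```python
def effective_column_options(column_names, column_options, column_definitions):
    """Illustrative only: merge grid-wide defaults with per-column overrides.

    Assumes per-column definitions win over the defaults for the same key.
    """
    return {
        name: {**column_options, **column_definitions.get(name, {})}
        for name in column_names
    }


col_opts = {"editable": False}          # default for every column
col_defs = {"A": {"editable": True}}    # override for column 'A' only

resolved = effective_column_options(["A", "B", "C"], col_opts, col_defs)
for name, opts in resolved.items():
    print(name, opts)
```

Under that assumption, only column 'A' ends up editable while 'B' and 'C' inherit the non-editable default.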
See the show_grid documentation for more information.
Disable editing on a per-row basis: This feature allows a user to specify whether or not a particular row should be editable. For example, to make it so only rows in the grid where the 'status' column is set to 'active' are editable, you might use the following code:
def can_edit_row(row):
    return row['status'] == 'active'

modin_spreadsheet.show_grid(df, row_edit_callback=can_edit_row)
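Because the callback is an ordinary Python function, it can be checked outside the notebook before it is handed to the widget. The sketch below uses plain dictionaries as stand-ins for the row objects the widget would pass in; that stand-in, and the sample data, are assumptions made for illustration.

```python
def can_edit_row(row):
    # Same predicate as above: only rows whose status is 'active' are editable.
    return row["status"] == "active"


# Plain dicts stand in for the rows the widget would pass to the callback.
sample_rows = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "archived"},
    {"id": 3, "status": "active"},
]

# Collect the ids of rows the callback would allow the user to edit.
editable_ids = [row["id"] for row in sample_rows if can_edit_row(row)]
print(editable_ids)
```

Keeping the predicate free of widget state makes it straightforward to cover with the pytest suite.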
Dynamically update an existing spreadsheet widget: These API methods allow users to programmatically update the state of an existing spreadsheet widget:
- edit_cell
- change_selection
- toggle_editable
- change_grid_option (experimental)
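As a rough sketch of how these methods might be combined, the function below selects a row, edits one cell, and then flips the grid's editable state. The argument lists are assumed to follow the Qgrid documentation that this project defers to (`edit_cell(index, column, value)`, `change_selection(rows=[...])`, `toggle_editable()`); please verify them against your installed version. `RecordingWidget`, `mark_row_reviewed`, and the 'status' column are hypothetical and exist only so the example runs without a notebook.

```python
class RecordingWidget:
    """Hypothetical stand-in that records calls instead of driving a real grid."""

    def __init__(self):
        self.calls = []

    def edit_cell(self, index, column, value):
        self.calls.append(("edit_cell", index, column, value))

    def change_selection(self, rows=()):
        self.calls.append(("change_selection", list(rows)))

    def toggle_editable(self):
        self.calls.append(("toggle_editable",))


def mark_row_reviewed(widget, index):
    """Select a row, set its (hypothetical) 'status' cell, then toggle editing."""
    widget.change_selection(rows=[index])
    widget.edit_cell(index, "status", "reviewed")
    widget.toggle_editable()


widget = RecordingWidget()
mark_row_reviewed(widget, 2)
for call in widget.calls:
    print(call)
```

In a notebook, the same function could be given the widget returned by `show_grid` instead of the recording stand-in, assuming the signatures match.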
MultiIndex Support: Modin-spreadsheet displays multi-indexed DataFrames with some of the index cells merged for readability, as is normally done when viewing DataFrames as a static HTML table. The following image shows Modin-spreadsheet displaying a multi-indexed DataFrame:
Events API: The Events API provides on and off methods, which can be used to attach and detach event handlers. They're available on both the modin_spreadsheet module (see qgrid.on) and on individual SpreadsheetWidget instances (see qgrid.QgridWidget.on).
Having the ability to attach event handlers allows us to do some interesting things in terms of using Modin-spreadsheet in conjunction with other widgets/visualizations. One example is using Modin-spreadsheet to filter a DataFrame that's also being displayed by another visualization.
Here's how you would use the on method to print the DataFrame every time there's a change made:
def handle_json_updated(event, spreadsheet_widget):
    # exclude 'viewport_changed' events since that doesn't change the DataFrame
    if (event['triggered_by'] != 'viewport_changed'):
        print(spreadsheet_widget.get_changed_df())

spreadsheet_widget.on('json_updated', handle_json_updated)
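The attach, filter, and detach flow can be tried without a notebook using a stand-in object. In the sketch below, `FakeSpreadsheet` and its `emit` method are hypothetical test helpers, not part of Modin-spreadsheet, and the 'triggered_by' values other than 'viewport_changed' are illustrative. The handler has the same shape as the one above but records the trigger instead of printing a DataFrame.

```python
class FakeSpreadsheet:
    """Hypothetical stand-in exposing on/off like the widget described above."""

    def __init__(self):
        self._handlers = {}

    def on(self, name, handler):
        self._handlers.setdefault(name, []).append(handler)

    def off(self, name, handler):
        self._handlers.get(name, []).remove(handler)

    def emit(self, event):
        # Test helper only; a real widget fires its own events.
        for handler in list(self._handlers.get(event["name"], [])):
            handler(event, self)


seen = []


def handle_json_updated(event, spreadsheet_widget):
    # Skip 'viewport_changed' events since they don't change the DataFrame.
    if event["triggered_by"] != "viewport_changed":
        seen.append(event["triggered_by"])


widget = FakeSpreadsheet()
widget.on("json_updated", handle_json_updated)
widget.emit({"name": "json_updated", "triggered_by": "viewport_changed"})  # ignored
widget.emit({"name": "json_updated", "triggered_by": "edit_cell"})         # recorded
widget.off("json_updated", handle_json_updated)
widget.emit({"name": "json_updated", "triggered_by": "edit_cell"})         # no handler
print(seen)
```

Detaching with off once a handler is no longer needed avoids stale callbacks firing when cells are re-run.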
Here are some examples of how the Events API can be applied.
This shows how you can use Modin-spreadsheet to filter the data that's being shown by a matplotlib scatter plot:
This shows how events are recorded in real-time. The demo is recorded on JupyterLab, which is not yet supported, but the functionality is the same on Jupyter Notebook.
If you'd like to contribute to Modin-spreadsheet, or just want to be able to modify the source code for your own purposes, you'll want to clone this repository and run Modin-spreadsheet from your local copy of the repository. The following steps explain how to do this.
Clone the repository from GitHub and cd into the top-level directory:
git clone https://github.com/modin-project/modin-spreadsheet.git
cd modin-spreadsheet
Install the current project in editable mode:
pip install -e .
Install the node packages that Modin-spreadsheet depends on and build Modin-spreadsheet's javascript using webpack:
cd js && npm install .
Install and enable Modin-spreadsheet's javascript in your local jupyter notebook environment:
jupyter nbextension install --py --symlink --sys-prefix modin_spreadsheet && jupyter nbextension enable --py --sys-prefix modin_spreadsheet
Run the notebook as you normally would with the following command:
jupyter notebook
If the code you need to change is in Modin-spreadsheet's python code, then restart the kernel of the notebook you're in and rerun any Modin-spreadsheet cells to see your changes take effect.
If the code you need to change is in Modin-spreadsheet's javascript or css code, repeat step 3 (the npm install step above) to rebuild Modin-spreadsheet's npm package, then refresh the browser tab where you're viewing your notebook to see your changes take effect.
There is a small python test suite which can be run locally by running the command pytest in the root folder of the repository.
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome. See the Running from source & testing your changes section above for more details on local Modin-spreadsheet development.
If you are looking to start working with the Modin-spreadsheet codebase, navigate to the GitHub issues tab and start looking through interesting issues.
Feel free to ask questions by submitting an issue with your question.