- Usage
- Demo PySpark Penguin App
- Directory Structure
- License
- Installation
- Contributing
- Code of conduct
Taipy is a Python library for creating Business Applications. More information on our website.
Demo PySpark Penguin App shows how to integrate PySpark (big data processing with Python's simplicity and Spark's speed) with Taipy, a Python library used for pipeline orchestration and scenario management.
- Level: Intermediate
- Topic: GUI/Core
This demo requires a Python version higher than 3.8. Install the dependencies listed in requirements.txt, then run main.py.
We'll design a workflow that performs two main tasks:
1- Spark task (spark_process):
- Load the data;
- Group the data by "species", "island" and "sex";
- Find the mean of the other columns ("bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g");
- Save the data.
2- Python task (filter):
- Load the output data saved previously by the Spark task;
- Given a "species", "island" and "sex", return the aggregated values.
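The two tasks above can be sketched roughly as follows. This is a minimal illustration and not the demo's actual code: the function names `spark_process` and `filter_penguins`, the `output_dir` layout, and the assumption that missing values appear as "NA" in the CSV are all hypothetical. The filter step uses only the standard library here, and the PySpark import is kept inside the Spark function so the filter can be tried without a Spark installation.

```python
import csv
import glob
import os
from typing import Dict, Optional

GROUP_COLS = ["species", "island", "sex"]
VALUE_COLS = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"]


def spark_process(input_csv: str, output_dir: str) -> None:
    """Spark task: load the data, group it, average the other columns, save."""
    # Imported here so the filter step below also runs without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("penguin_spark_app").getOrCreate()
    try:
        # Assumption: missing values are written as "NA" in the input CSV.
        df = spark.read.csv(input_csv, header=True, inferSchema=True, nullValue="NA")
        means = df.groupBy(*GROUP_COLS).agg(
            *[F.mean(col).alias(col) for col in VALUE_COLS]
        )
        # Spark writes a directory of part-*.csv files, not a single file.
        means.write.mode("overwrite").csv(output_dir, header=True)
    finally:
        spark.stop()


def filter_penguins(
    output_dir: str, species: str, island: str, sex: str
) -> Optional[Dict[str, Optional[float]]]:
    """Python task: return the aggregated values for one group, or None."""
    wanted = {"species": species, "island": island, "sex": sex}
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*.csv"))):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                if all(row[col] == wanted[col] for col in GROUP_COLS):
                    # Empty cells (null means) are returned as None.
                    return {
                        col: float(row[col]) if row[col] else None
                        for col in VALUE_COLS
                    }
    return None
```

Calling `spark_process` first and then `filter_penguins` on the same output directory mirrors the order of the two tasks. In the demo itself, Taipy is what chains these steps together.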
- src/: Contains the demo source code.
  - data/
    - penguin.csv: The data as downloaded from the palmerpenguins git repo.
  - penguin_spark_app.py: Spark application.
  - config.py: Taipy configuration which models our data workflow.
  - main.py: The main script (including our application GUI).
- CODE_OF_CONDUCT.md: Code of conduct for members and contributors of this demo.
- CONTRIBUTING.md: Instructions to contribute to this demo.
- INSTALLATION.md: Instructions to install this demo.
- LICENSE: The Apache 2.0 License.
- README.md: Current file.
Copyright 2022 Avaiga Private Limited
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Want to install Demo PySpark Penguin App? Check out our INSTALLATION.md file.
Want to help build Demo PySpark Penguin App? Check out our CONTRIBUTING.md file.
Want to be part of the Demo PySpark Penguin App community? Check out our CODE_OF_CONDUCT.md file.