Now you don't need Twitter to keep up with the music world.
I. Introduction
- What is Candyfloss?
- Why Candyfloss?
- Project Goals & Intended Audience
- Installing Dependencies
- Running the Code
- Running Tests
III. Architecture
- Overall Architecture
- How Candyfloss Works
IV. Style Guide
- CSS
- The Code Itself
- Accessibility
- API Overview
- API for a Specific Publication
- Database API
VI. Giving Thanks
- Next Steps
- How Can You Contribute?
- Closing Credits
- Cited Sources
- Candyfloss Outlets
- Outlets to Add
VII. Glossary
- Python
- Beautiful Soup
- Requests
- lxml
- Pytest
- Pylint
- yapf
- Flask
- PythonAnywhere
- Virtual Environment
- Dependencies
- Pip
- SQLite
- API (Application Programming Interface)
- JSON (JavaScript Object Notation)
Candyfloss is a digital daily newspaper curating the best music news and essays. Now you don't need Twitter to keep up with the music world. Candyfloss's feed has a strict cut-off to fight endless scrolling. It displays the 50 most recent links from several outlets and refreshes every hour.
To use Candyfloss:
- Open Candyfloss: https://www.candyfloss.app/.
- Click a link.
- Enjoy!
The problem: I don't want to rely on social media for music news.
The solution: I created a simple RSS reader web app.
Candyfloss aims to replace Twitter as your source for music news. Additional news outlets and improvements to the code's web scraping are pending.
An ideal user is a fellow music journalist or music fan who wants to discover some of the best music writing today without social media.
The deployed app: https://www.candyfloss.app/.
To work with Candyfloss locally, download the most updated versions of the following tools and libraries (unless a specific version is noted):
- Python (3.8.6)
- Flask
- Beautiful Soup
- Requests
- lxml
- pytest
- pylint
- yapf
- SQLite
- DB Browser for SQLite
- Visual Studio Code
- PythonAnywhere
- Google's "RSS Subscription Extension" Chrome plugin
- Icons8 (for the current corn favicon)
- Python Dotenv
More requirements can be found in requirements.txt.
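Python packages listed in requirements.txt are typically installed with pip (pip install -r requirements.txt), ideally inside a virtual environment. To confirm the libraries above are importable afterwards, a small check like the following may help. It is a minimal sketch, not part of the Candyfloss codebase: the function name and the module list are assumptions based on the usual import names of the libraries listed above.

```python
from importlib.util import find_spec

# Usual import names for the libraries listed above. Some differ from the
# package names (Beautiful Soup imports as bs4, Python Dotenv as dotenv).
REQUIRED_MODULES = [
    "flask", "bs4", "requests", "lxml", "pytest", "pylint", "yapf", "dotenv",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```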
To run Candyfloss locally:
- Open your IDE of choice.
- Open a terminal window.
- Enter and run this command: flask --debug run
- Open the development server URL provided in the terminal output.
- You should now see Candyfloss!
Pytest is testing app.py and each file found in the feeds folder. More feed and database testing to come. To run the tests for Candyfloss:
- Open a new terminal window.
- Enter and run the command: pytest -v
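For anyone new to pytest, here is a rough idea of what a feed test can look like. This is a hypothetical sketch rather than a copy of the tests in the tests folder: SAMPLE_FEED is stand-in data and find_incomplete_entries is an illustrative helper. Only the key names come from the feed structure described in this README.

```python
REQUIRED_KEYS = {"idx", "title", "URL", "author", "publication", "date"}

# Stand-in data; a real test would import the list built by a file in feeds.
SAMPLE_FEED = [
    {"idx": 0, "title": "Example review", "URL": "https://example.com/review",
     "author": "Jane Doe", "publication": "Example Outlet",
     "date": "2023-01-01"},
]


def find_incomplete_entries(feed):
    """Return the entries that are missing at least one required key."""
    return [entry for entry in feed
            if not REQUIRED_KEYS.issubset(entry.keys())]


def test_every_entry_has_required_keys():
    assert find_incomplete_entries(SAMPLE_FEED) == []


def test_incomplete_entry_is_reported():
    broken = {"idx": 1, "title": "No URL here"}
    assert find_incomplete_entries([broken]) == [broken]
```

Pytest collects any function whose name starts with test_, so saving this in a file such as test_example.py and running pytest -v would report both tests.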
The most important folders and files to know:
- feeds: holds the web scraping for each publication.
- static: contains images and styles.css.
- templates: the HTML you see on the browser.
- tests: where I test app.py and feeds using pytest.
- app.py: where I combine the feeds into one clean and organized feed that is then rendered to templates.
Below is the workflow for adding new outlets:
I first find a publication's working RSS feed. If a website doesn't promote its RSS, I can find it by typing "/rss" or "/feed" at the end of a URL, or I use Google's "RSS Subscription Extension" Chrome plugin.
To publications that make their RSS feeds easy to find: Thank you!
Each file in the feeds folder is where I use Beautiful Soup, requests, and lxml to call and clean up the RSS. This is done in three parts:
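"Get" the soup: make my GET request:
    import requests
    from bs4 import BeautifulSoup

    def get_soup():
        # 'RSS URL' is a placeholder for the publication's feed address
        html_text = requests.get('RSS URL', timeout=10).text
        # the 'xml' parser is provided by lxml
        return BeautifulSoup(html_text, 'xml')

    soup = get_soup()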
"Cook" the soup: pinpoint the repeating element holding the content I want:
    def cook_soup():
        # each article is in an <item/>
        return soup.find_all('item')

    articles = cook_soup()
"Deliver" the soup: use a for loop to append the info I need into my empty lists, which I then combine into a new list of dictionaries that looks like the following:
    PUBLICATION = [
        {'idx': idx,
         'title': title,
         'URL': URL,
         'author': author,
         'publication': publication,
         'date': date}
        for idx, title, URL, author, publication, date in zip(
            index_list, title_list, URL_list, author_list,
            publication_list, date_list)
    ]
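The three steps can also be tried end to end without a network connection. The sketch below is an illustration, not the code in the feeds folder: it swaps requests and Beautiful Soup for Python's built-in xml.etree.ElementTree so it runs offline, and the sample RSS, the "Unknown" author fallback, and the function signatures are all assumptions.

```python
import xml.etree.ElementTree as ET

# A tiny stand-in for the text a publication's RSS URL would return.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Outlet</title>
    <item>
      <title>First review</title>
      <link>https://example.com/first</link>
      <author>Jane Doe</author>
      <pubDate>Mon, 02 Jan 2023 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Second review</title>
      <link>https://example.com/second</link>
      <pubDate>Tue, 03 Jan 2023 09:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def get_soup(rss_text):
    """'Get': parse the raw RSS text into an element tree."""
    return ET.fromstring(rss_text)


def cook_soup(root):
    """'Cook': each article lives in an <item> element."""
    return root.iter("item")


def deliver_soup(items, publication):
    """'Deliver': build one dictionary per article."""
    return [
        {
            "idx": idx,
            "title": item.findtext("title", default=""),
            "URL": item.findtext("link", default=""),
            # Assumption: fall back to "Unknown" when no author is credited.
            "author": item.findtext("author", default="Unknown"),
            "publication": publication,
            "date": item.findtext("pubDate", default=""),
        }
        for idx, item in enumerate(items)
    ]


EXAMPLE_OUTLET = deliver_soup(cook_soup(get_soup(SAMPLE_RSS)), "Example Outlet")
```

Building each dictionary directly from the item skips the intermediate lists and zip call, which is one possible simplification and not necessarily how the feed files are written.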
REFACTORING NOTES: Since first deploying this app, I've refactored these steps to include an "Outlet" class to better abstract the structure and abilities of a publication. p4k.py displays how this process began. p4k_class.py shows how this process has evolved. Future work will include ways to take more advantage of class methods to simplify repeating logic.
It can also be challenging when information is missing (mostly when publications don't credit their authors) or isn't formatted like most RSS feeds. The most common examples of the latter involve times and dates, which I clean up and standardize using Python's datetime functionality.
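As an illustration of the date clean-up, the sketch below normalizes two common feed date formats into timezone-aware UTC datetimes. It is a hypothetical helper, not the code used in the feed files: the function name, the choice of formats, and the decision to treat dates without a timezone as UTC are all assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def standardize_date(raw_date):
    """Return a timezone-aware UTC datetime, or None if parsing fails."""
    raw_date = raw_date.strip()
    try:
        # RFC 822 style, e.g. "Mon, 02 Jan 2023 10:00:00 +0000"
        parsed = parsedate_to_datetime(raw_date)
    except (TypeError, ValueError):
        try:
            # ISO 8601 style, e.g. "2023-01-02T10:00:00+00:00"
            parsed = datetime.fromisoformat(raw_date)
        except ValueError:
            return None
    if parsed.tzinfo is None:
        # Assumption: treat dates without a timezone as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Storing every date as the same kind of object means links from different outlets can be compared and sorted directly.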
In app.py, I import all the publication feeds, combine them into one feed, and then use a sorting function to order this new feed by each link's date. I also slice away any publication links after a set number, which for now is 50. This cleaned-up feed is then rendered into my main app route, along with the current date at any given time.
    # combining our feeds (pub1, pub2, pub3 stand in for every imported feed)
    link_dicts = pub1 + pub2 + pub3
    # ordering our combined feed by date, newest first
    link_dicts_sorted = sorted(link_dicts, key=lambda i: i['date'], reverse=True)
    # reducing our feed to return a specific set number
    link_dicts_sorted_and_reduced = link_dicts_sorted[0:50]
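To try the combine, sort, and slice steps with data, here is a small self-contained sketch. The build_feed function and the sample feeds are hypothetical, and it assumes each link's date is already a comparable datetime object.

```python
from datetime import datetime

FEED_LIMIT = 50  # the cut-off described above


def build_feed(*feeds, limit=FEED_LIMIT):
    """Combine feeds, order them newest first, and keep the first `limit`."""
    combined = [link for feed in feeds for link in feed]
    newest_first = sorted(combined, key=lambda link: link["date"], reverse=True)
    return newest_first[:limit]


# Hypothetical stand-ins for two imported publication feeds.
pub1 = [{"title": "Older piece", "date": datetime(2023, 1, 1)}]
pub2 = [{"title": "Newer piece", "date": datetime(2023, 1, 3)},
        {"title": "Middle piece", "date": datetime(2023, 1, 2)}]

front_page = build_feed(pub1, pub2, limit=2)
```

With limit=2, front_page holds only the two most recent links, which mirrors how the 50-link cut-off drops older items.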
Candyfloss's styling is inspired by the print covers of the 'London Review of Books.'
Candyfloss's CSS is all done in styles.css. Media queries are currently set for the following break points:
- 992px: most iPads and Surface Pros.
- 600px: most iPhones and Samsung Galaxies.
- 360px: for the Galaxy Fold.
Color CSS variables are defined as:
--black: #0d0d0d;
--white: #ffffff;
--gold: #ffd900;
--blue: #0026ff;
Candyfloss follows Google's Python style guide as closely as possible. This involves:
- Using pylint for automated code linting.
- Using yapf for auto-formatting.
- Including Google's settings file for Vim and pylintrc.
The deployed Candyfloss app received an overall pass on mobile and desktop Lighthouse reports. The main area for improvement is mobile performance, which is held back by first contentful paint, time to interactive, and total blocking time.
Desktop:
- Performance: 100
- Accessibility: 100
- Best Practice: 92
- SEO: 90
Mobile:
- Performance: 84
- Accessibility: 100
- Best Practice: 92
- SEO: 92
To view the API for Candyfloss's entire feed:
- Complete the steps found in Installing Dependencies.
- Find and click the end of the local server URL.
- Type in "/api" to the end of the URL.
- Press enter.
- You'll now see the API!
To view the API for a specific publication:
- Follow the previous "API Overview" steps.
- At the end of the URL, type in "/PUBLICATION-NAME" (e.g. "/api/Pitchfork").
- Press enter.
- You'll now see the API for your specific publication.
For now, publication names are case-sensitive. For example, you need to write "Pitchfork" with a capital P. Please go to the "Candyfloss Outlets" section of this README to see what publications are currently available to view on this API and how to spell them.
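The same endpoints can be reached from a script while the local server is running. The sketch below rests on several assumptions: the base URL is Flask's default development address, the API is assumed to respond with JSON, and percent-encoding names that contain spaces is a guess about how the route expects them. The helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: Flask's default development server address.
BASE_URL = "http://127.0.0.1:5000"


def api_url(publication=None, base_url=BASE_URL):
    """Build the URL for the whole feed or for one publication."""
    if publication is None:
        return f"{base_url}/api"
    # Names are case-sensitive, so the capitalization is kept as typed.
    return f"{base_url}/api/{quote(publication)}"


def fetch_feed(publication=None, base_url=BASE_URL):
    """Request the API and decode the response, assuming it is JSON."""
    with urlopen(api_url(publication, base_url), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, api_url("Pitchfork") builds the address for the Pitchfork feed, and fetch_feed("Pitchfork") would request it from the running server.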
Candyfloss uses an SQLite database currently holding the outlets and RSS links being scraped.
To view this list of scraped outlets:
- At the end of your local server URL, add "/db".
Future refactoring will make this database more dynamic, pull directly from all the feeds being imported into app.py, and make it visible on the deployed app.
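For readers unfamiliar with SQLite in Python, here is a minimal sketch of storing and listing outlets with the built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and may not match the actual Candyfloss database. An in-memory database is used so nothing is written to disk.

```python
import sqlite3

# Hypothetical schema: one row per outlet with its RSS link.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE outlets (id INTEGER PRIMARY KEY, name TEXT, rss_url TEXT)")
connection.executemany(
    "INSERT INTO outlets (name, rss_url) VALUES (?, ?)",
    [("Example Outlet", "https://example.com/feed"),
     ("Another Outlet", "https://example.org/rss")],
)
connection.commit()


def list_outlets(conn):
    """Return every outlet as a dictionary, ordered by name."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT name, rss_url FROM outlets ORDER BY name").fetchall()
    return [dict(row) for row in rows]


outlets = list_outlets(connection)
```

A list of dictionaries like outlets is straightforward to return as JSON from a Flask route.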
I'm making updates to Candyfloss whenever I can. Future actions to take include:
- Updating SQLite database to PostgreSQL
- Updating pytest to now account for object and database refactoring and more closely follow Google's Python style guide
- Utilizing class methods to further abstract some of my repeating logic when building and cleaning up feeds
- Refactoring older feed files to incorporate my new class structures
- Adding more publications
- Fleshing out the app's overall metadata
- Adding a search field on the UI
- Expanding upon the current 404 page
- Utilizing relative data analysis or machine learning
Contributions are welcome in any way you can! I'm especially looking for help fleshing out my pytest automated testing and for suggestions on new outlets to add.
A special shout-out to Nish Tahir for giving thoughtful feedback on an early version of this app, and James Bennington for his guidance on how to document my code.
Candyfloss would not be possible without the following:
- Pallets's intro to Flask is a recommended starting point for anyone wanting to explore Flask.
- Waweru Mwaura has a great blog post on the basics of using pytest with Flask.
- I also want to thank Magnitopic for their helpful YouTube video on how to deploy a Flask app to PythonAnywhere.
- Digital Ocean, CodingCasually, and ProfessorPitch were all helpful with the initial local connection to SQLite.
- Doodles by Doodle Ipsum.
Candyfloss pulls from the following outlets:
- OPE! (my own music blog)
- Pitchfork (album reviews)
- Stereogum (new music)
- Aquarium Drunkard (latest posts)
- The Ringer (music section)
- Fluxblog (substack)
- Music Journalism Insider (substack)
- Penny Fractions (ghost)
- Chicago Reader (Gossip Wolf column)
- Uproxx (music section)
- Abundant Living (ghost)
- Billboard (Chart Beat column)
- No Bells (latest posts)
- The Quietus (reviews)
- Loud And Quiet (reviews)
- No Depression (reviews)
- So It Goes (substack)
- Reply Alt (substack)
- Wire (In Writing column)
- Passion of the Weiss (latest posts)
- New York Times (music section)
- The Guardian (music section)
- NME (music features)
- VAN (classical music features)
- The Alternative (new music)
The outlets I want to add next:
- Bandcamp
- Creem
- Eight-Bit Theory
- Four Columns
- GQ
- SPIN
- Vulture
- Last Donut of the Night
- BBC
- Mix Mag
- Slate
- Atlantic
- Aeon
- OkayPlayer
- Music Business Worldwide
- Texas Monthly
- John's Music Blog
- 4Columns
- New Yorker
- NPR
- ... and more!
- Python: A high-level, interpreted programming language that is widely used in web development, scientific computing, data analysis, artificial intelligence, and more.
- Beautiful Soup: A Python library used for web scraping to extract data from HTML and XML files. It provides a set of simple methods to navigate and search the parse tree created from the HTML/XML source.
- Requests: A Python library used for making HTTP requests. It provides a simple and elegant way to send HTTP/1.1 requests using Python. It supports SSL/TLS and authentication.
- lxml: A Python library used for processing XML and HTML documents. It provides a simple and powerful API to parse, validate, and manipulate XML and HTML documents.
- Pytest: A Python testing framework used to write and run unit tests, and a popular alternative to Python's built-in unittest module.
- Pylint: A Python library used for analyzing Python source code for errors and enforcing coding standards. It provides a set of rules and guidelines to help improve the quality and maintainability of Python code.
- yapf: A Python library used for formatting Python code according to a consistent style. It provides a simple and configurable way to automatically format Python code.
- Flask: A micro web framework written in Python. It is classified as a micro-framework because it does not require particular tools or libraries.
- PythonAnywhere: A cloud-based Python development and hosting platform. It provides a web-based Python development environment, a Python web hosting service, and a set of tools and features to help developers build and deploy Python applications easily.
- Virtual Environment: A tool that helps to keep dependencies required by different projects separate by creating isolated Python environments for them.
- Dependencies: External packages or libraries that are required by a Python application to run properly.
- Pip: The package installer for Python. You can use pip to install Python packages from the Python Package Index and other indexes.
- SQLite: A software library that provides a relational database management system.
- API (Application Programming Interface): A set of protocols, routines, and tools used for building software applications. It specifies how software components should interact and makes it easier to develop software by providing pre-built components that can be used to build larger applications.
- JSON (JavaScript Object Notation): A lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. It is a text format that is completely language-independent.
© 2023 Brady Gerber. All Rights Reserved.