A simple Python package for downloading files from https://ilias.uni-mannheim.de.
- Automatically synchronizes all files on each run: only new or updated files and videos are downloaded.
- Uses the BeautifulSoup package for scraping and the multiprocessing package to speed up the download.
Easy way via pip:

```bash
pip3 install iliasDownloaderUniMA
```

Otherwise, you can clone or download this repo and then run

```bash
python3 setup.py install
```

inside the repo directory.
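Either way, a quick sanity check (not part of the package itself, just a plain import) is that the following command runs without errors:

```bash
python3 -c "from IliasDownloaderUniMA import IliasDownloaderUniMA"
```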
Starting from version 0.5.0, only your `uni_id` and your password are required. In general, a simple download script that downloads all files of the current semester looks like this:
```python
from IliasDownloaderUniMA import IliasDownloaderUniMA

m = IliasDownloaderUniMA()
m.setParam('download_path', '/path/where/you/want/your/files/')
m.login('your_uni_id', 'your_password')
m.addAllSemesterCourses()
m.downloadAllFiles()
```
The method `addAllSemesterCourses()` adds all courses of the current semester by default. However, it's possible to modify the search behaviour by passing a regex pattern for `semester_pattern`. Here are some examples:
```python
# Add all courses from your ilias main page from year 2020:
m.addAllSemesterCourses(semester_pattern=r"\([A-Z]{2,3} 2020\)")

# Add all FSS/ST courses from your ilias main page:
m.addAllSemesterCourses(semester_pattern=r"\((FSS|ST) \d{4}\)")

# Add all HWS/WT courses from your ilias main page:
m.addAllSemesterCourses(semester_pattern=r"\((HWS|WT) \d{4}\)")

# Add all courses from your ilias main page. Even non-regular semester
# courses like 'License Information (Student University of Mannheim)',
# i.e. courses without a semester inside the course name:
m.addAllSemesterCourses(semester_pattern=r"\(.*\)")
```
You can also exclude specific courses by passing a list of the corresponding ILIAS ref IDs:
```python
# Add all courses from your ilias main page. Even non-regular semester
# courses. Except the courses with the ref id 954265 or 965389.
m.addAllSemesterCourses(semester_pattern=r"\(.*\)", exclude_ids=[954265, 965389])
```
A more specific example:
```python
from IliasDownloaderUniMA import IliasDownloaderUniMA

m = IliasDownloaderUniMA()
m.setParam('download_path', '/Users/jonathan/Desktop/')
m.login('jhelgert', 'my_password')
m.addAllSemesterCourses(exclude_ids=[1020946])
m.downloadAllFiles()
```
Note that the backslash `\` is a special character inside a Python string, so on a Windows machine it's necessary to use a raw string for the `download_path`. Since a raw string can't end with a single backslash, the trailing separator is omitted:

```python
m.setParam('download_path', r'C:\Users\jonathan\Desktop')
```
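If you prefer to keep the trailing separator, two equivalent alternatives (plain Python string handling, nothing specific to this package) also work:

```python
m.setParam('download_path', 'C:\\Users\\jonathan\\Desktop\\')  # escaped backslashes
m.setParam('download_path', 'C:/Users/jonathan/Desktop/')      # forward slashes are also accepted on Windows
```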
The parameters can be set with the `.setParam(param, value)` method, where `param` is one of the following:

- `'num_scan_threads'`: number of threads used for scanning for files inside the folders (default: `5`).
- `'num_download_threads'`: number of threads used for downloading all files (default: `5`).
- `'download_path'`: the path all the files will be downloaded to (default: the current working directory).
- `'tutor_mode'`: downloads all submissions for each task unit once the deadline has expired (default: `False`).
- `'verbose'`: prints information while scanning the courses (default: `False`).

For example:
```python
from IliasDownloaderUniMA import IliasDownloaderUniMA

m = IliasDownloaderUniMA()
m.setParam('download_path', '/Users/jonathan/Desktop/')
m.setParam('num_scan_threads', 20)
m.setParam('num_download_threads', 20)
m.setParam('tutor_mode', True)
m.login('jhelgert', 'my_password')
m.addAllSemesterCourses()
m.downloadAllFiles()
```
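The `'verbose'` parameter from the list above isn't used in these scripts; as a minimal sketch, it is enabled the same way as the other parameters:

```python
m.setParam('verbose', True)  # print information while the courses are scanned
```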
Since some lecturers don't use ILIAS, it's possible to use an external scraper function via the `addExternalScraper(fun, *args)` method. Here `fun` is the external scraper function and `*args` are the corresponding variable number of arguments. Note that it's mandatory to use `course_name` as the first function argument of your scraper. Your external scraper is expected to return a list of dicts with the keys:
```python
# 'course':   <the course name>
# 'type':     'file'
# 'name':     <name of the parsed file>
# 'size':     <file size (in MB) as float>
# 'mod-date': <the modification date as datetime object>
# 'url':      <file url>
# 'path':     <path where you want to download the file>
```
Here's an example:
```python
from IliasDownloaderUniMA import IliasDownloaderUniMA
from urllib.parse import urljoin
from bs4 import BeautifulSoup
from dateparser import parse
import requests

def myExtScraper(course_name, url):
    """
    Extracts all file links from the given url.
    """
    files = []
    file_extensions = (".pdf", ".zip", ".tar.gz", ".cc", ".hh", ".cpp", ".h")
    soup = BeautifulSoup(requests.get(url).content, "lxml")
    for link in [i for i in soup.find_all(href=True) if i['href'].endswith(file_extensions)]:
        file_url = urljoin(url, link['href'])
        resp = requests.head(file_url)
        files.append({
            'course': course_name,
            'type': 'file',
            'name': file_url.split("/")[-1],
            'size': 1e-6 * float(resp.headers['Content-Length']),
            'mod-date': parse(resp.headers['Last-Modified']),
            'url': file_url,
            'path': course_name + '/'
        })
    return files

m = IliasDownloaderUniMA()
m.login("jhelgert", "my_password")
m.addAllSemesterCourses()
m.addExternalScraper(myExtScraper, "OOP for SC", "https://conan.iwr.uni-heidelberg.de/teaching/oopfsc_ws2020/")
m.downloadAllFiles()
```
Feel free to contribute in any form! Feature requests, bug reports, or PRs are more than welcome.