Merge pull request #1 from AbuseIO/edits-before-bluepencil
Some changes before blue-pencil
widooo authored Jul 23, 2024
2 parents 1813725 + 55d6df4 commit 8c1a9ae
Showing 13 changed files with 173 additions and 202 deletions.
51 changes: 26 additions & 25 deletions docs/architecture/index.md
Expand Up @@ -26,7 +26,7 @@ code in the form of procedures (often known as methods). In OOP, computer progra
are designed by making them out of objects that interact with one another.

Important is the use of the WinterCMS framework. SCARt uses as much as possible the
standard functionality of this framework. See here for the [WinterCMS Development Guide](https://wintercms.com/docs/v1.2/docs/architecture/developer-guide#html-element-naming).
standard functionality of this framework. See here for the [WinterCMS Development Guide](https://wintercms.com/docs/v1.2/docs/architecture/developer-guide).

SCARt also adapts the version numbering of "major.minor.point". For example v1.0.1 or v5.3.2.

Expand All @@ -52,9 +52,9 @@ See here for more information about the versioning: [WinterCMS plugin version hi

## Containers

The MVC is futher enhanced in SCART by containerizing the different system components. This
makes them scalable and provide failover functionality. With containers the application
components are also placed in a seperated (local) network.
The MVC framework is further enhanced in SCART by containerizing the different system
components. This makes them scalable and provides failover functionality. With containers
the application components are also placed in a separate (local) network.

A basic container setup:

Expand All @@ -69,63 +69,64 @@ Within the PHP-CRON the background work exists of the following jobs:

| Name | Description |
|:--------------|:--------------------------------------------------------------------------------------------------------------------------------------|
| ImportExport | Import by email or ICCAM the reports |
| ImportExport | Import from email or ICCAM of the reports |
| AnalyzeInput | Read the imported reports, scrape them and get the WhoIs information |
| CheckNTD | Checks if the illegal reports are still online and the WHoIs the same (*) |
| CheckNTD | Checks if the illegal reports are still online and if the Whois-information is unchanged (*) |
| SendNTD | Send NTD by email (or API) to the hoster, registrar, site owner or LEA |
| SendAlert | Send alerts to the info mailbox about the actions done by the background jobs |
| UpdateWHoIs | Update the WhoIs every 12 hours |
| UpdateWHoIs   | Update the information retrieved from Whois every 12 hours                                                                            |
| CreateReports | Create the user reports (export CSV files) |
| Cleanup | Every night this background job runs to cleanup the SCARt environment (*) |
| Archive | In some SCARt environments the number of reports are such big, archiving is needed to keep the runtime performance optimized |
| Archive       | In some SCARt environments the number of reports is so large that archiving is needed to keep the runtime performance optimized      |

(*) see seperated chapters for more information
(*) see the dedicated chapters for more information

It's easy within docker to make for each of these jobs a seperated container. In this
way the performance can be optimzied.
It's easy within docker to create a dedicated container for each of these jobs. In this
way the performance can be optimized.

The following container setup is an example for an optimized setup:

![containers-Performance.png](containers-Performance.png)

In this setup the ImportExport and CheckNTD are placed in a seperated containers with
an own work and resource environment.
In this setup the ImportExport and CheckNTD jobs are placed in separate containers with
their own work and resource environment.

## Realtime online check

The CheckNTD job is responsible for the check if an URL is still online.
The CheckNTD job is responsible for checking whether a URL is still online.

The standard CheckNTD job is a single PHP job (threat) which starts a headless browser an
The standard CheckNTD job is a single PHP job (thread) which starts a headless browser and
checks each URL.

For bigger hotlines with a lot of illegal URLs to check, there is also the realtime version
of the CheckNTD job. This realtime version used pooling to start and stop dynamically threats
of the CheckNTD job. This realtime version uses pooling to dynamically start and stop threads
for checking the online status. The configuration consists of:

- maximum time within which a URL has to be checked again (default 4 hours)
- minimum time after which a URL has to be checked (default 1 hour)
- minimum time after which spinning down a worker (default 15 min)
- minimum time after which a worker will be spun down (default 15 min)

There is no limit from the number of concurrent threats other then the resources on the
There is no limit on the number of concurrent threads other than the resources on the
hosting server(s).

Note that the threat job not only checks the online status of an image but also the
WhoIs information. The hosting (country) information can be changed. On that moment the
Note that the thread job not only checks the online status of an image but also the
Whois information. The hosting (country) information can change. At that moment the
report is placed in the status "CHANGED".

A report is set "offline" when 3 times after eachother, with a delay of 3 minutes, the
A report is set to "offline" when the image (hash) is not found online 3 times in a row,
with an interval of 3 minutes between the checks.
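
The offline rule can be illustrated with a minimal sketch. This is not SCARt's actual implementation (SCARt is PHP based); the function name `should_mark_offline` and the boolean history list are hypothetical, and only the threshold of 3 consecutive misses and the 3 minute interval come from the text above.

```python
# Values taken from the documentation text.
REQUIRED_MISSES = 3            # consecutive failed checks before "offline"
CHECK_INTERVAL_SECONDS = 180   # assumed spacing between those checks (3 minutes)


def should_mark_offline(history: list[bool]) -> bool:
    """Return True when the most recent checks all failed to find the image.

    `history` is a hypothetical list of check results in chronological order,
    where True means the image (hash) was found online.
    """
    if len(history) < REQUIRED_MISSES:
        return False
    # Only the last REQUIRED_MISSES results count; one hit in between resets the run.
    return not any(history[-REQUIRED_MISSES:])
```

A single successful check in between resets the run, so `[False, False, True, False]` does not yet lead to "offline".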

### Illegal content browsing

With the Docker and threading setup, the following secure browsing environment exists:
![img_1.png](img_1.png)<br />
Within the browser container (docker), the website is analyzed and media is download for
classification. The browser context (website with illegal content) is reset after the
scraping of the website. The headless browser environment has no direct contact with the

Within the browser container (docker), the website is analyzed and media is downloaded for
classification. The browser context (website with illegal content) is reset after scraping
of the website. The headless browser environment has no direct contact with the
other SCARt components and/or server and/or SCART client user.

The setup of this realtime version is done by the S3group. They have to knowledge to configure
The setup of this realtime version is done by the S3group. They have the knowledge to configure
and maintain this environment. Please ask your SCARt contact for more information.


40 changes: 16 additions & 24 deletions docs/basic/classification.md
Expand Up @@ -11,7 +11,7 @@ can select one or more reports to work on.
## Classify

**Note: the CLASSIFY function is the most "important" user screen with a lot of options
to control everything. Advicable is to use this function on a large (bigger) screen.**
to control everything. It is advisable to use this function on a large screen.**

When you open a report you find the following screen:

Expand All @@ -27,7 +27,7 @@ You find the following general controls:
| Hide... | Hide records or not |
| Select... | Select records based on status and current selection |
| Bulk... | Do an action on the selected records |
| ![img_8.png](../images/img_8.png) | Regresh the screen |
| ![img_8.png](../images/img_8.png)   | Refresh the screen                                   |

Special controls:

Expand All @@ -45,48 +45,40 @@ to the bottom till all the images are showed.

In the LIST view you see only a group of records (5, 10, 25, or 50) and you can page
through all the records. This view is handy when there are a lot of records (>100)
so the display of the classify screen is very buys with loading the images.
and the classify screen becomes very busy with loading the images.

In each view the following buttons for each record are supported:

| Button | Action | Description |
|:----------------------------------------|:-------------------|:-|
| ![img_11.png](../images/img_11.png) | Set ILLEGAL |Illegal classify|
| ![img_10.png](../images/img_10.png) | Set NOT ILLEGAL |Not illegal classify|
| ![img_11.png](../images/img_11.png) | Set ILLEGAL |Classify illegal|
| ![img_10.png](../images/img_10.png) | Set NOT ILLEGAL |Classify legal|
| ![img_12.png](../images/img_12.png) | Set IGNORE |Ignore (eg icon)|
| ![img_13.png](../images/img_13.png) | Set FIRST POLICE |Send to police and wait|
| ![img_14.png](../images/img_14.png) | Set MANUAL |Manual check if online|
| ![img_13.png](../images/img_13.png) | Set FIRST POLICE |Send to LEA and wait|
| ![img_14.png](../images/img_14.png) | Set MANUAL |Manual check whether online|
| ![img_15.png](../images/img_15.png) | Edit record fields |Edit different fields|


The FIRST POLICE is only possible when the record is classified as ILLEGAL. The
abusecontact with POLICE marked (on) will be informed by email with all the
records (urls) marked with FIRST POLICE.
FIRST POLICE is only possible when the record is classified as ILLEGAL. The
abuse contact marked as POLICE will be informed by email with all the
records (URLs) marked with FIRST POLICE.

Set MANUAL is also only possible when the record is ILLEGAL. When this is set
SCARt will automatically only check the WhoIs information and not if the
url (image) is online. In the function CHECKONLINE records can be set
offline.
SCARt will automatically only check the Whois information and will not check
whether the URL (image) is online. In the function CHECKONLINE records can be set
offline.

With EDIT a number of fields can be updated:

![img_16.png](../images/img_16.png)

## Rules

With rules the flow and (eg) hosting setting of records can be overruled. You can
set the hoster or site owner based on the domain or set a proxy service (like
Cloudfest) for determining the real IP.
With rules the flow and e.g. hosting records can be overruled. You can set the hoster
or site owner based on the domain or set a proxy service (like Cloudflare) for
determining the real IP.

The RULES function is available within the classify function with only the options
(domains) valid for the records and as general function with all the possibilities.

See [Rules](../details/rules.md) for more information.








33 changes: 17 additions & 16 deletions docs/basic/import.md
Expand Up @@ -14,24 +14,25 @@ SCARt has several ways for importing reports:
## Manual input

In the INPUTS function you can use CREATE to manually add an input. Required is a
valid url, workuser, source and type. Default this input will set on the scrape
status so the input (URL) will automatically picked up by the SCARt background
process to analyze and scrape.
valid URL, workuser, source and type. By default this input will be set on the
scrape status so the input (URL) will automatically be picked up by the SCARt
background process to analyze and scrape.

## ICCAM

SCARt supports ICCAM API version v2 and v3.
SCARt supports ICCAM API version v2 and v3.

The main difference between v3 and v2 is that in v3 the synchronization is done
on a much more direct one-to-one level between SCARt and ICCAM.

System admin has to configure:
- ICCAM certificate (one time each year)
- ICCAM certificate (once a year)
- ICCAM API user and password

## By upload

In the SCARt functionm INPUTS you can import a CSV file with on every line:
In the SCARt function INPUTS you can import a CSV file in which every line can
consist of:

`
URL;REFERER;WORKUSER-EMAIL;REFERENCE;SOURCE;TYPE
Expand All @@ -42,17 +43,17 @@ Note: URL is **required**, the other fields are optional.
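
As an illustration, a line in this format could be parsed as in the sketch below. It is not SCARt's own importer; the function name `parse_input_line` and the dictionary keys are hypothetical. It assumes fields are separated by a plain `;` and that no field itself contains a semicolon.

```python
# Field order as given in the documentation; the key names are illustrative.
FIELDS = ("url", "referer", "workuser_email", "reference", "source", "type")


def parse_input_line(line: str) -> dict[str, str]:
    """Split one import line into named fields, padding missing optional ones."""
    parts = [part.strip() for part in line.strip().split(";")]
    if len(parts) > len(FIELDS):
        raise ValueError(f"expected at most {len(FIELDS)} fields, got {len(parts)}")
    # Optional fields that are left out become empty strings.
    parts += [""] * (len(FIELDS) - len(parts))
    record = dict(zip(FIELDS, parts))
    if not record["url"]:
        raise ValueError("URL is required")
    return record
```

A line holding only a URL yields a record in which all other fields are empty strings.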
## By email

SCARt can read a mailbox to receive webform input. The admin has to configure
the mailbox account. After this, all email in this mailbox will automatically
process by SCART.
the mailbox account. After this, all email in this mailbox will automatically be
processed by SCART.

Note that for security reasons, a whitelist access policy is set up. Within SCARt
the admin can put sender email addresses on the whitelist (SETTINGS -> WHITE LIST).
If the sender of an email is not on this list, the import email will not be accepted.
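
The whitelist policy amounts to a membership test on the sender address. The sketch below shows the idea under stated assumptions: the function name `is_sender_allowed` is hypothetical, and the case-insensitive comparison is an assumption, not documented SCARt behaviour.

```python
from email.utils import parseaddr


def is_sender_allowed(from_header: str, whitelist: set[str]) -> bool:
    """Return True when the address in a From header is on the whitelist.

    Addresses are compared case-insensitively, which is an assumption here.
    """
    _display_name, address = parseaddr(from_header)
    if not address:
        return False
    return address.lower() in {entry.lower() for entry in whitelist}
```

For example, `is_sender_allowed("Alice <alice@example.org>", {"alice@example.org"})` accepts the mail, while an unknown sender is rejected.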

### Email import of reports

You can send an email with as subject "SCART-INPUT" to the mailbox. Each line can
hold:
You can send an email with the subject "SCART-INPUT" to the mailbox. Each line can
consist of:

`
URL;REFERER;NOTE
Expand All @@ -66,7 +67,7 @@ Note:

### Email import with source specification

You can send an email with as subject "SCART-INPUT-SOURCE [source]" to the mailbox
You can send an email with the subject "SCART-INPUT-SOURCE [source]" to the mailbox
with "[source]" replaced with the source you want to be set. If the source is not
found in SCARt, then the source is automatically added.

Expand All @@ -75,14 +76,14 @@ The body format is the same as above (including the note).
### Email import Content Removed

You can send an email with the subject "SCART-CONTENTREMOVED" to the mailbox. Each
line can hold the url of the report in SCARt to be set on CLOSE, including
sending to ICCAM the CR action.
line can hold the URL of the report in SCARt to be set on CLOSE. This will result
in a notice to ICCAM for the CR action.

### Email import Content Unavailable

You can send an email with as subject "SCART-CONTENTUNAVAILABLE" to the mailbox. Each
line can hold the url of the report in SCARt to be set on CLOSE, including
sending to ICCAM the CU action.
You can send an email with the subject "SCART-CONTENTUNAVAILABLE" to the mailbox. Each
line can hold the URL of the report in SCARt to be set on CLOSE. This will result
in a notice to ICCAM for the CU action.
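
The subject conventions described in the email import sections can be summarized in a small dispatcher. This is a sketch only: the function name `classify_subject` and the returned labels are hypothetical, and exact, case-sensitive matching of the subject is an assumption.

```python
from typing import Optional


def classify_subject(subject: str) -> tuple[str, Optional[str]]:
    """Map an email subject to an (action, source) pair.

    The longer "SCART-INPUT-SOURCE" prefix is tested before "SCART-INPUT",
    because the former starts with the latter.
    """
    text = subject.strip()
    prefix = "SCART-INPUT-SOURCE"
    if text.startswith(prefix):
        source = text[len(prefix):].strip()
        return ("input", source or None)
    if text == "SCART-INPUT":
        return ("input", None)
    if text == "SCART-CONTENTREMOVED":
        return ("content_removed", None)
    if text == "SCART-CONTENTUNAVAILABLE":
        return ("content_unavailable", None)
    return ("unknown", None)
```

Anything that does not match one of the four subjects is returned as `"unknown"`, so a caller can decide how to handle it.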

### Custom webform email import

13 changes: 7 additions & 6 deletions docs/basic/ntd.md
Expand Up @@ -2,8 +2,8 @@

---

Within the NTD function you can find the Notice & Take Down (NTD) messages send
and which are waiting te be send.
Within the NTD function you can find the Notice & Take Down (NTD) messages that
were sent and the ones which are waiting to be sent.

## Sending NTD's

Expand All @@ -13,8 +13,8 @@ There are a number of moments when SCARt starts a NTD:
- after 24 hours when the illegal content is still online
- when marked as POLICE in classify (ntd to police contact)

After "starting" a NTD, SCARt will group urls for the samen abusecontact until the
hour-threshold is reached.
After "starting" a NTD, SCARt will group URLs for the same abuse contact until
the hour-threshold is reached.

Default hour-thresholds:

Expand All @@ -25,8 +25,9 @@ Default hour-thresholds:

Note: these thresholds can be set individually for each SCARt environment.

Before actual sending a NTD, SCARt will last minute check for each attached url
if the hoster is still the same. If not, the url will be removed from the NTD.
Before actually sending a NTD, SCARt does a last-minute check for each attached URL
to see whether the hoster is still the same. If not, the URL will be removed from the
NTD.

## NTD email template

12 changes: 6 additions & 6 deletions docs/basic/report.md
Expand Up @@ -2,13 +2,13 @@

---

In SCARt a basic reporting module is include which export the data into a CSV
In SCARt a basic reporting module is included which exports the data into a CSV
(comma separated values) file.

You can create a report in SCARt. After this, the report will be run in the background
because of the possible time needed to generate the report. After finishing, an email
is send to the report notify email address and the report can be downloaded when you
open the report.
You can create a report in SCARt. After this, the report will be created in the
background because of the time needed to generate the report. When finished, an
email is sent to the report notify email address with a link to download the
report.

## Columns

Expand All @@ -19,7 +19,7 @@ and also in which order.

The export format is CSV with the field content enclosed in quotes like:

"field 1";"field 2";.."Field n";
"Field 1";"Field 2";.."Field n";

This file can directly be opened by spreadsheet programs like Microsoft Excel or
LibreOffice Calc.
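
For scripted processing, an export in this format can also be read with Python's standard `csv` module. The sketch below assumes a semicolon delimiter, double-quote quoting, and the trailing `;` shown in the example above; the function name `read_export` is hypothetical.

```python
import csv
import io


def read_export(text: str) -> list[list[str]]:
    """Parse semicolon-separated, quoted export text into rows of fields."""
    rows = []
    for row in csv.reader(io.StringIO(text), delimiter=";", quotechar='"'):
        # The trailing ';' after the last field yields one extra empty field.
        # It is dropped here; this assumes the export always ends lines that way.
        if row and row[-1] == "":
            row = row[:-1]
        rows.append(row)
    return rows
```

Because the fields are quoted, a semicolon inside a field value does not split that field.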
31 changes: 4 additions & 27 deletions docs/details/cleanup.md
Expand Up @@ -7,17 +7,17 @@
Every night the cleanup background job runs to do a checkup of the SCARt environment.
The following actions are done:

1. Recycle of the SCARt application logfile
1. Recycle the SCARt application logfile
2. Reset inputs that are open for classification and have not been looked at in the past 24 hours, so that they are scraped again
3. Remove cached images that have finished being analyzed
4. Cleanup of the WhoIs cache; removal not active domain and/or IP records
3. Remove cached images when analyzing is finished
4. Cleanup of the Whois cache; removal of inactive domains and/or IP records
5. Rewind the ICCAM import one day to be sure every ICCAM report is imported
6. Make anonymous if the retention time is met
7. Cleanup "deleted marked records" in the database

## Anonymous

SCARt can be configured (not standard) to anonymouse privacy related fields. These fields
SCARt can be configured (not standard) to anonymize privacy related fields. These fields
include:

- URL
Expand All @@ -35,26 +35,3 @@ reports without the privacy information can be reported (exported).

The specific retention time has to be configured. Please contact your SCARt contact for
more information.






















