Implement I18n for scholia #1609

Open
wants to merge 109 commits into
base: master
0bdca4d
Merge branch 'master' of github.com:WDscholia/scholia
DanielFLopez Aug 31, 2021
9ad63e4
Start
Aug 9, 2021
78c9966
add lang_code cookie and menu to select lang.
curibe Aug 11, 2021
30f6a48
Add langCode for wikiurls
Aug 11, 2021
6921f54
init jqueryi18n tests for location
curibe Aug 13, 2021
95abed2
Set english as first in language list
Aug 23, 2021
7412514
Fix li
Aug 23, 2021
2c3e8bf
Add i18n data attribute to html for location aspect
Aug 24, 2021
deed653
Add i18n data attribute to html for location_empty
Aug 24, 2021
03bfa60
Add i18n data attribute to html for software aspect
Aug 24, 2021
f4400cc
Add i18n data attribute to html for author aspect
Aug 24, 2021
938c48a
Add aspect in i18n data attribute to html location aspect
curibe Aug 24, 2021
cbfdc0c
Add aspect in i18n data attribute to html software aspect
curibe Aug 24, 2021
970e56a
Add aspect in i18n data attribute to html author aspect
curibe Aug 24, 2021
d9a6225
Add i18n data attribute to html for award aspect
Aug 24, 2021
6ad8430
Add i18n data attribute to html for chemical aspect
Aug 24, 2021
fe9b575
Add i18n data attribute to html for clinical aspect
Aug 24, 2021
9d710fc
Add i18n data attribute to html for organization aspect
Aug 24, 2021
7c1b4b2
Add i18n data attribute to html for couuntry aspect
Aug 24, 2021
9e791d0
Add i18n data attribute to html for countries aspect
Aug 24, 2021
db210a6
Add i18n data attribute to html for complex aspect
Aug 24, 2021
d2dc9ca
Add i18n data attribute to html for printer aspect
Aug 24, 2021
9fb2f0d
Add aspect in i18n data attribute to several html
curibe Aug 25, 2021
16666f9
add cookie and autolanguage to sparql location aspect
curibe Aug 25, 2021
665c912
fix some flake8 errors for views.py
curibe Aug 25, 2021
3265ae0
add docstring to functions to fix flake8 error
curibe Aug 25, 2021
62d959c
add docstring to functions to fix flake8 error
curibe Aug 25, 2021
694b55d
add cookie and autolanguage to sparql complex aspect
curibe Aug 25, 2021
533c181
add cookie and autolanguage to sparql award aspect
curibe Aug 25, 2021
3bb2461
add cookie and autolanguage to sparql software aspect
curibe Aug 25, 2021
37c7f32
add cookie and autolanguage to sparql author aspect
curibe Aug 25, 2021
6657be8
add cookie and autolanguage to sparql chemical aspect
curibe Aug 25, 2021
c8040b1
add cookie and autolanguage to sparql clinical aspect
curibe Aug 25, 2021
66d5b18
add cookie and autolanguage to sparql organization aspect
curibe Aug 25, 2021
ee4d8a7
add cookie and autolanguage to sparql country aspect
curibe Aug 25, 2021
2a8240d
add cookie and autolanguage to sparql countries aspect
curibe Aug 25, 2021
de92d58
add cookie and autolanguage to sparql printer aspect
curibe Aug 25, 2021
ee04141
add cookie and autolanguage to sparql 404-chemical aspect
curibe Aug 26, 2021
3de57b4
add cookie and autolanguage to sparql catalogue aspect
curibe Aug 26, 2021
0ffe654
add cookie and autolanguage to sparql cito aspect
curibe Aug 26, 2021
53e16b9
add cookie and autolanguage to sparql dataset aspect
curibe Aug 26, 2021
4bf5029
add cookie and autolanguage to sparql disease aspect
curibe Aug 26, 2021
224baca
add cookie and autolanguage to sparql event aspect
curibe Aug 26, 2021
3843beb
add cookie and autolanguage to sparql gene aspect
curibe Aug 26, 2021
5e1b760
add cookie and autolanguage to sparql lexeme aspect
curibe Aug 26, 2021
86f2173
add cookie and autolanguage to sparql pathway aspect
curibe Aug 26, 2021
e4ae7e7
add cookie and autolanguage to sparql project aspect
curibe Aug 26, 2021
bcae022
add cookie and autolanguage to sparql property aspect
curibe Aug 26, 2021
4cb37bc
add cookie and autolanguage to sparql protein aspect
curibe Aug 26, 2021
3f9504c
add cookie and autolanguage to sparql publisher aspect
curibe Aug 26, 2021
4d9f2ae
add cookie and autolanguage to sparql series aspect
curibe Aug 26, 2021
bfa43ce
add cookie and autolanguage to sparql sponsor aspect
curibe Aug 26, 2021
f858adb
add cookie and autolanguage to sparql taxon aspect
curibe Aug 26, 2021
da2598c
add cookie and autolanguage to sparql topic(s) aspect
curibe Aug 26, 2021
18bbd60
add cookie and autolanguage to sparql use aspect
curibe Aug 26, 2021
aedbb2e
add cookie and autolanguage to sparql venue(s) aspect
curibe Aug 26, 2021
a0993cd
add cookie and autolanguage to sparql work(s) aspect
curibe Aug 26, 2021
6423d7d
Add i18n data attribute to html for cito aspect
Aug 26, 2021
1903e3a
Add i18n data attribute to html for protein aspect
Aug 26, 2021
73ab8d0
Add i18n data attribute to html for publisher aspect
Aug 26, 2021
bf45351
Add i18n data attribute to html for event aspect
Aug 26, 2021
070c8ea
Add i18n data attribute to html for gene aspect
Aug 26, 2021
8796df3
Add i18n data attribute to html for lexeme aspect
Aug 26, 2021
09bfe7d
Add i18n data attribute to html for project aspect
Aug 26, 2021
bcd4d60
Add i18n data attribute to html for work aspect
curibe Sep 1, 2021
71969f2
Add i18n data attribute to html for venue(s) aspect
curibe Sep 1, 2021
28883c4
Add i18n data attribute to html for use aspect
curibe Sep 1, 2021
40555b6
Add i18n data attribute to html for topic(s) aspect
curibe Sep 1, 2021
fb483c0
Add i18n data attribute to html for text_to_topics aspect
curibe Sep 1, 2021
c2e20d0
Add i18n data attribute to html for taxon aspect
curibe Sep 1, 2021
2253c78
Add i18n data attribute to html for sponsor aspect
curibe Sep 1, 2021
54fe2d1
Add i18n data attribute to html for series aspect
curibe Sep 1, 2021
32c37d9
Add i18n data attribute to html for search aspect
curibe Sep 1, 2021
837a88a
Add i18n data attribute to h4 html for protein_emptyt
curibe Sep 1, 2021
a7db724
Add i18n data attribute to html for property aspect
curibe Sep 2, 2021
b840d25
Add i18n data attribute to h4 html for project_empty
curibe Sep 2, 2021
198eb9e
Add i18n data attribute to html for pathway aspect
curibe Sep 2, 2021
4dc41ad
Add i18n data attribute to html for index aspect
curibe Sep 2, 2021
594b679
Add i18n data attribute to h4 html for gene_empty
curibe Sep 2, 2021
c6b02ec
Add i18n data attribute to html for disease aspect
curibe Sep 2, 2021
4a02d40
Add i18n data attribute to html for dataset aspect
curibe Sep 2, 2021
fcfa072
Add i18n data attribute to h4 html for complex_empty
curibe Sep 2, 2021
a0f5445
Add i18n data attribute to h4 html for chemical aspect
curibe Sep 2, 2021
db86b4a
Add i18n data attribute to html for catalogue aspect
curibe Sep 2, 2021
4921000
Add i18n data attribute to h4 html for author aspect
curibe Sep 2, 2021
7dae80b
Add i18n data attribute to html for arxiv aspect
curibe Sep 2, 2021
59356a1
Add i18n data attribute to html for about aspect
curibe Sep 2, 2021
3609793
Add i18n data attribute to html for 404_doi aspect
curibe Sep 2, 2021
0f010a3
add lang-cookie in filter-lang to sparql author(s) aspect
curibe Sep 2, 2021
fb1975a
add lang-cookie in filter-lang to sparql award aspect
curibe Sep 3, 2021
c5db12c
add lang-cookie in filter-lang to sparql chemical aspect
curibe Sep 3, 2021
47c7d46
add lang-cookie in filter-lang to sparql clinical-trial aspect
curibe Sep 3, 2021
1b0a1e9
add lang-cookie in filter-lang to sparql complex aspect
curibe Sep 3, 2021
d402210
add lang-cookie in filter-lang to sparql country aspect
curibe Sep 3, 2021
f4dc1c7
add lang-cookie in filter-lang to sparql dataset aspect
curibe Sep 3, 2021
a887065
add lang-cookie in filter-lang to sparql disease aspect
curibe Sep 3, 2021
34bd9a1
add lang-cookie in filter-lang to sparql event aspect
curibe Sep 3, 2021
981d1d4
add lang-cookie in filter-lang to sparql gene aspect
curibe Sep 3, 2021
f521aae
add lang-cookie in filter-lang to sparql organization aspect
curibe Sep 3, 2021
a40de09
add lang-cookie in filter-lang to sparql pathway aspect
curibe Sep 3, 2021
a28af5f
add lang-cookie in filter-lang to sparql project aspect
curibe Sep 3, 2021
53b33d9
add lang-cookie in filter-lang to sparql protein aspect
curibe Sep 3, 2021
33eea71
add lang-cookie in filter-lang to sparql publisher aspect
curibe Sep 3, 2021
e2cf4bf
add lang-cookie in filter-lang to sparql taxon aspect
curibe Sep 3, 2021
ba9b72a
add lang-cookie in filter-lang to sparql topic aspect
curibe Sep 3, 2021
e999a7d
add lang-cookie in filter-lang to sparql venue aspect
curibe Sep 3, 2021
f0e0571
add lang-cookie in filter-lang to sparql work aspect
curibe Sep 3, 2021
92f1a24
add template.json with data-i18n attr of all html files
curibe Sep 3, 2021
0b99ac7
add script to create json and update requirements
curibe Sep 6, 2021
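
Several of the commits above mention passing a `lang_code` cookie and "autolanguage" into the SPARQL templates. The diff shown on this page does not include that code, so the following is only a minimal sketch of how a cookie value could be turned into a label-service clause. The helper name `label_service_clause` and the validation pattern are assumptions made for illustration, not the PR's implementation.

```python
import re

# Hypothetical helper, not taken from the PR: accept only simple language
# tags such as "es" or "pt-br" so a cookie value cannot inject SPARQL.
_LANG_RE = re.compile(r"^[a-z]{2,3}(-[a-z0-9]+)*$")


def label_service_clause(lang_code, fallback="en"):
    """Build a Wikidata label-service clause from a cookie-supplied code."""
    languages = []
    if lang_code and _LANG_RE.match(lang_code.lower()):
        languages.append(lang_code.lower())
    else:
        # No usable cookie: fall back to the query service placeholder.
        languages.append("[AUTO_LANGUAGE]")
    if fallback not in languages:
        languages.append(fallback)
    return (
        'SERVICE wikibase:label { bd:serviceParam wikibase:language "'
        + ",".join(languages)
        + '". }'
    )


print(label_service_clause("es"))
print(label_service_clause(None))
```

Validating the cookie before interpolating it matters because the value comes from the browser and ends up inside a query string.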
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
@@ -1,6 +1,6 @@
 name: Python package

-on: [push]
+on: [push, workflow_dispatch]

 jobs:
   build:
268 changes: 268 additions & 0 deletions i18n_create_json.py
@@ -0,0 +1,268 @@
"""
Create json file for internationalization process.

Script to create json file using data-i18n attributes
inside html files for internationalization process
"""

import json
import click
import re
import glob
from pathlib import Path
from bs4 import BeautifulSoup
from tabulate import tabulate
from colorama import Fore, Style
from collections import OrderedDict, Counter


def read_file(filename):
    """Read a file and return its content as a string."""
    with open(filename, "r") as fstream:
        content = fstream.read()
    return content


def write_file(filename, content):
    """Write into a file."""
    with open(filename, "w") as f:
        f.write(content)


def read_json(filename):
    """Read a json file and return its content as a dict."""
    with open(filename, "r") as f:
        content = json.loads(f.read())
    return content


def write_json(filename, data):
    """Write dict as json file."""
    with open(filename, "w") as f:
        f.write(json.dumps(data, indent=4))


def show_table(datadict, color, fmt="pretty"):
    """Show dict as a table with tabulate with color."""
    print(color)
    print(tabulate(datadict, headers="keys", tablefmt=fmt))
    print(Style.RESET_ALL)


def print_info(datadict, color, title=''):
    """Show dict info as indented json with color."""
    print(color)
    print(f"+{title:-^60s}+")
    print(json.dumps(datadict, indent=4))
    print(f"+{'':-^60s}+")
    print(Style.RESET_ALL)


def create_json_i18n(filename, json_content, verbose=False):
    """Create the dict/json i18n from html content.

    Search for data-i18n attribute inside html content
    and generate/update the json file following banana format
    """
    content = read_file(filename)
    soup = BeautifulSoup(content, 'html.parser')
    matches = soup.find_all([], {"data-i18n": True})
    oldfields = set(json_content)

    for tag in matches:
        if not json_content.get(tag.get("data-i18n"), None):
            json_content.update({f"{tag.get('data-i18n')}": ''})

    newfields = {
        key
        for key in json_content.keys() if key not in oldfields
    }
    if verbose:
        show_table({
            "file": [Path(filename).name],
            "existing fields": oldfields,
            "new fields": newfields
        }, Fore.YELLOW)


def normalize(name):
    """Allow click to use command with underscore."""
    return name.replace("_", "-")


@click.group(context_settings={"token_normalize_func": normalize})
def cli():
    """Create/update json for internationalization.

    This program allows you to create or update a <lang>.json file
    for an internationalization process, using the banana format. You
    can look for the data-i18n attribute in one or several html files at a
    time, extract them, and create/update the json file. You can also
    check whether any attributes are duplicated in the html files before
    putting them in the json file.

    To show help for a specific command, you can run:

    python i18n_create_json.py COMMAND --help
    """
    pass


@cli.command()
@click.option('-f', "--file", required=True,
              help="The html file to be scanned")
@click.option('--output', '-o', required=True,
              help="The name of the output json file")
@click.option('-i', "--inplace", is_flag=True,
              help="Create/update the file. Without this option, "
                   "the command is executed in dry-run mode")
@click.option('-v', "--verbose", is_flag=True,
              help="Show more detailed information about the process")
def onefile(**kwargs):
    """Search for all data-i18n attributes inside one html file.

    This command allows you to look for all data-i18n attributes inside one
    html file passed on the command line with the option -f/--file and create
    or update a json file with these attributes following the banana format.

    The output name is resolved under static/i18n, in the parent directory
    of the directory that holds the html file.


    How to use:

    1. To execute in dry-run mode

    $ python i18n_create_json.py onefile --file="path/to/file.html"
    -o output.json


    2. To execute and replace in-place

    $ python i18n_create_json.py onefile --file="path/to/file.html"
    -o output.json -i/--inplace

    """
    filename = kwargs['file']
    trfile_content = {}
    verbose = kwargs["verbose"]

    outfile = Path(filename).parent.parent / "static/i18n" / kwargs['output']

    if outfile.exists():
        trfile_content = read_json(outfile)

    metadata = {"@metadata": trfile_content.pop("@metadata", None)}
    create_json_i18n(filename, trfile_content, verbose)
    trfile_content = {
        **metadata,
        **OrderedDict(sorted(trfile_content.items()))
    }

    if not kwargs["inplace"]:
        print_info(
            trfile_content,
            Fore.LIGHTGREEN_EX,
            title=f"New content for {kwargs['output']}"
        )
    else:
        write_json(outfile, trfile_content)


@cli.command()
@click.option("--pattern", required=True,
              help="The html files to scan, given as a unix wildcard pattern")
@click.option('--output', '-o', required=True,
              help="The name of the output json file")
@click.option('-i', "--inplace", is_flag=True,
              help="Create/update the file. Without this option, "
                   "the command is executed in dry-run mode")
@click.option('-v', "--verbose", is_flag=True,
              help="Show more detailed information about the process")
def severalfiles(**kwargs):
    """Search for all data-i18n attributes inside several html files.

    This command allows you to look for all data-i18n attributes inside
    several html files passed on the command line with the option --pattern
    as a pattern. You can use bash wildcards. With this pattern, you can
    create or update the json file with these attributes following the
    banana format.

    The output name is resolved under static/i18n, in the parent directory
    of the directory that holds the html files.


    How to use:

    1. To execute in dry-run mode

    $ python i18n_create_json.py severalfiles --pattern="path/to/file*.html"
    -o output.json


    2. To execute and replace in-place

    $ python i18n_create_json.py severalfiles --pattern="path/to/file*.html"
    -o output.json -i/--inplace

    """
    pattern = kwargs['pattern']
    verbose = kwargs["verbose"]

    trfile_content = {}
    outfile = Path(pattern).parent.parent / "static/i18n" / kwargs['output']

    if outfile.exists():
        trfile_content = read_json(outfile)

    metadata = {"@metadata": trfile_content.pop("@metadata", None)}
    files = glob.glob(pattern)
    for file in files:
        create_json_i18n(file, trfile_content, verbose)

    trfile_content = {
        **metadata,
        **OrderedDict(sorted(trfile_content.items()))
    }

    if not kwargs["inplace"]:
        print_info(
            trfile_content,
            Fore.LIGHTGREEN_EX,
            title=f"New content for {kwargs['output']}"
        )
    else:
        write_json(outfile, trfile_content)


@cli.command()
@click.option('--path', required=True,
              help="The html files to scan, given as a unix wildcard pattern")
def check_duplicates(**kwargs):
    """Look for duplicated data-i18n attributes.

    This command allows you to look for all duplicated data-i18n attributes
    inside several html files passed on the command line with the option
    --path as a pattern.

    How to use:

    1. To show duplicated data-i18n attributes

    $ python i18n_create_json.py check_duplicates
    --path="path/to/file*.html"

    """
    path = kwargs["path"]
    files = glob.glob(path)
    rx = re.compile(r'(data-i18n\b=\"([^"]*)\")')
    content = []
    for file in files:
        string = read_file(file)
        matches = rx.finditer(string)
        for match in matches:
            content.append(match.group(2))

    duplicates = [key for key, val in Counter(content).items() if val > 1]

    show_table(
        {
            "KEY DUPLICATES": duplicates if duplicates
            else ["There are no duplicated keys"]
        },
        color=Fore.LIGHTRED_EX,
        fmt="simple"
    )


if __name__ == '__main__':
    cli()
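
The core of the script above (collect `data-i18n` keys, merge them into an existing message file while keeping `@metadata` first, and report duplicates) can be tried without any third-party packages. The sketch below swaps BeautifulSoup for the standard library's `html.parser`, so it is not the script's own code path. The class and function names and the sample keys are made up for the example.

```python
import json
from collections import Counter
from html.parser import HTMLParser


class I18nKeyParser(HTMLParser):
    """Collect the value of every data-i18n attribute, in document order."""

    def __init__(self):
        super().__init__()
        self.keys = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when absent.
        for name, value in attrs:
            if name == "data-i18n" and value:
                self.keys.append(value)


def extract_keys(html):
    """Return all data-i18n values found in an HTML string."""
    parser = I18nKeyParser()
    parser.feed(html)
    parser.close()
    return parser.keys


def merge_keys(existing, keys):
    """Return a new dict: @metadata first, then the message keys sorted.

    Existing translations are kept; new keys get an empty string.
    """
    messages = {k: v for k, v in existing.items() if k != "@metadata"}
    for key in keys:
        messages.setdefault(key, "")
    merged = {"@metadata": existing.get("@metadata")}
    merged.update(sorted(messages.items()))
    return merged


def duplicated(keys):
    """Return the keys that occur more than once."""
    return [key for key, count in Counter(keys).items() if count > 1]


# Sample input; these keys are illustrative, not taken from the templates.
SAMPLE_HTML = """
<h2 data-i18n="author-list-of-publications">List of publications</h2>
<h2 data-i18n="author-topics">Topics</h2>
<p data-i18n="author-topics">Topics again</p>
"""

keys = extract_keys(SAMPLE_HTML)
merged = merge_keys(
    {"@metadata": {"authors": []}, "author-topics": "Topics"}, keys
)
print(json.dumps(merged, indent=4))
print(duplicated(keys))
```

Parsing the HTML, rather than matching it with a regular expression as `check_duplicates` does, also picks up attributes written with single quotes.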
6 changes: 5 additions & 1 deletion requirements.txt
@@ -8,4 +8,8 @@ requests
 simplejson
 werkzeug>=0.9
 urllib3>=1.25.1
-feedparser
+feedparser
+colorama
+bs4
+tabulate
+click
12 changes: 11 additions & 1 deletion runserver.py
@@ -1,10 +1,20 @@
from scholia.app import create_app

from flask import redirect, request

app = create_app(
    text_to_topic_q_text_enabled=False,
    third_parties_enabled=True)
app.config['APPLICATION_ROOT'] = '/'
app.config['SUPPORTED_LANGUAGES'] = {'en': 'English'}
app.secret_key = 'p9uyg7yuwriwjigjergkrgrrrr'

# @app.route('/')
# def redir():
#     lang = request.cookies.get('lang_code',None)
#     if lang:
#         return redirect('/'+ lang + '/')
#     else:
#         return redirect('/en/')

if __name__ == '__main__':
    app.run(debug=True, port=8100)
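
Two points in `runserver.py` may be worth a second look: the secret key is committed in plain text, and the commented-out redirect trusts whatever the `lang_code` cookie contains. The sketch below shows one possible alternative, written as plain functions so it runs without Flask. The environment variable name `SCHOLIA_SECRET_KEY` and both function names are assumptions for illustration, not part of the PR.

```python
import os
import secrets


def load_secret_key(environ=os.environ, name="SCHOLIA_SECRET_KEY"):
    """Return the secret from the environment, or a random per-process one.

    The variable name is hypothetical. The random fallback means sessions
    do not survive a restart, which is usually fine for local development.
    """
    return environ.get(name) or secrets.token_hex(32)


def language_redirect_target(cookies, supported, default="en"):
    """Pick the redirect path for '/', ignoring unsupported cookie values."""
    lang = cookies.get("lang_code")
    if lang not in supported:
        lang = default
    return "/" + lang + "/"


SUPPORTED_LANGUAGES = {"en": "English"}
print(language_redirect_target({"lang_code": "en"}, SUPPORTED_LANGUAGES))
print(language_redirect_target({"lang_code": "../admin"}, SUPPORTED_LANGUAGES))
```

In the Flask app this could be wired up as `app.secret_key = load_secret_key()` and, inside the route, `redirect(language_redirect_target(request.cookies, app.config['SUPPORTED_LANGUAGES']))`.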