Do not import tables as both table and figure #2

Open
Daniel-Mietchen opened this issue Jun 14, 2014 · 14 comments
@Daniel-Mietchen
Member

In some journals (e.g. at PLOS), tables are being made available as image files in addition to tabular format. If the latter exists, we should always go for it and not embed the former in the Wikisource text.

I am open to the idea of importing the image file nonetheless, as it may sometimes be useful in Wikipedia articles, but my suggested default setting would be to ignore those image files entirely.

Sample case:

https://commons.wikimedia.org/w/index.php?title=File:Tracking-Marsupial-Evolution-Using-Archaic-Genomic-Retroposon-Insertions-pbio.1000436.t001.jpg&oldid=126539852 .

@notconfusing
Member

What is the way to determine this automatically? What clues are given in the text?

@Daniel-Mietchen
Member Author

Pinging @Klortho for advice.

@Klortho
Member

Klortho commented Jun 16, 2014

Tables are notoriously difficult to render well, and as I'm sure you all know, conversion to/from HTML and other formats often mangles the presentation pretty badly. I wasn't aware that PLoS also provides tables as images, but I imagine that the reason is to provide a reference for how it appears in the published work. Compare the table image above with how it appears in PMC.

So, I'm not actually sure what the question here is. Is it whether to upload the table image to Commons even if we have the table in HTML form? If so, I'd suggest yes, because it doesn't seem like it would do any harm.

@Daniel-Mietchen
Member Author

As stated above, I'm open to having the table image on Commons. Just want to make sure we do not embed both the HTML and the image of the table into the same Wikisource page.

@notconfusing
Member

@Klortho, I guess the question would be: "Is it automatically detectable when both tabular and image formats are available for a table?" Is that right, @Daniel-Mietchen? I think there is the <alternatives> tag, as in this fragment from that article. So, if the <alternatives> tag exists, ignore the image?

<table-wrap id="pbio-1000436-t001" position="float">
<object-id pub-id-type="doi">10.1371/journal.pbio.1000436.t001</object-id>
<label>Table 1</label>
<caption><title>Presence-absence table of the marsupial markers.</title></caption>
<alternatives><graphic id="pbio-1000436-t001-1" xlink:href="pbio.1000436.t001"/>
....
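
A minimal sketch of that check in Python, using only the standard library. The fragment in the thread is truncated, so the `<table>` in `SAMPLE` is a hypothetical stand-in, and the function name `has_table_and_image` is made up for illustration. The test is also slightly stricter than "the `<alternatives>` tag exists": it assumes we only want to ignore the image when `<alternatives>` holds both a `<table>` and a `<graphic>`.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical completion of the fragment above: the real table content is
# elided in the thread, so the <table> here is only a placeholder.
SAMPLE = f"""
<table-wrap xmlns:xlink="{XLINK}" id="pbio-1000436-t001" position="float">
  <label>Table 1</label>
  <alternatives>
    <graphic id="pbio-1000436-t001-1" xlink:href="pbio.1000436.t001"/>
    <table><tr><td>placeholder</td></tr></table>
  </alternatives>
</table-wrap>
"""


def has_table_and_image(table_wrap):
    """Return True if <alternatives> offers both a <table> and a <graphic>."""
    alternatives = table_wrap.find("alternatives")
    if alternatives is None:
        return False
    return (
        alternatives.find("table") is not None
        and alternatives.find("graphic") is not None
    )


wrap = ET.fromstring(SAMPLE)
print(has_table_and_image(wrap))
```

An image-only table (a `<graphic>` with no `<table>` sibling) returns `False` here, so it would not be treated as redundant.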

@Klortho
Member

Klortho commented Jun 27, 2014

That looks right.

@notconfusing
Member

So is this possible to do in the XSL transform? I.e., can the XSL not
transform the table if an image of the table exists, or will we have to
figure it out in post-processing?

Max Klein
http://notconfusing.com/
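
Besides handling this inside the XSL or after it, a third option is to clean the JATS XML before the transform runs. The sketch below is one way that could look in Python, not the project's actual pipeline; `drop_redundant_table_graphics` is a hypothetical helper name, and it assumes the table image appears as a `<graphic>` directly inside `<alternatives>` within a `<table-wrap>`.

```python
import xml.etree.ElementTree as ET


def drop_redundant_table_graphics(root):
    """Remove <graphic> from table-wrap <alternatives> that also hold a <table>.

    Returns the number of <graphic> elements removed.
    """
    removed = 0
    # list() so the tree is not mutated while iter() is still walking it.
    for wrap in list(root.iter("table-wrap")):
        for alternatives in wrap.findall("alternatives"):
            if alternatives.find("table") is None:
                continue  # image-only table: keep the graphic
            for graphic in alternatives.findall("graphic"):
                alternatives.remove(graphic)
                removed += 1
    return removed


doc = ET.fromstring(
    "<body>"
    "<table-wrap id='t1'><alternatives><graphic/><table/></alternatives></table-wrap>"
    "<table-wrap id='t2'><alternatives><graphic/></alternatives></table-wrap>"
    "</body>"
)
print(drop_redundant_table_graphics(doc))  # prints 1: only t1 has both forms
```

With the redundant `<graphic>` gone from the input, the transform would have nothing to turn into an image embed for those tables, while image-only tables pass through unchanged.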


@wrought
Member

wrought commented Jul 26, 2014

I think it is perfectly acceptable for a display format based on the original text (e.g. the MediaWiki markup article on Wikisource, which is based on the original XML) to display both an HTML table and a raster image of the table when the original stores both. Both are considered data in the source, and it's only by convention that publishers expose one and not the other. It's a feature for the Wikimedia community to be able to access and use both for various purposes (e.g. one as data, another as display in an article).

Perhaps we should close this issue for now and open one on the JATS-to-mediawiki repository if/when relevant.

@wrought closed this as completed Jul 26, 2014
@wrought
Member

wrought commented Jul 26, 2014

Also, @Klortho, sometimes PLOS creates raster images of tables out of pure laziness, to get tables to convert and display correctly on the journal website, satisfying journal editors and authors.

@Daniel-Mietchen
Member Author

Reopening.

At the moment, we do render the table and post the embed code for the table's image, but since we do not actually upload the image file, this produces redlinks. These should go, so I suggest removing the embed code for such table figures.
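
If the cleanup were done on the generated wikitext rather than in the transform, it might look roughly like the sketch below. It rests on two assumptions that the thread does not confirm: that the embed code has the usual `[[File:...]]` form, and that table images can be recognised by the PLOS-style `.t<digits>.<extension>` ending seen in the sample case (`...pbio.1000436.t001.jpg`). The name `strip_table_image_embeds` is hypothetical.

```python
import re

# Assumption: table images end in ".t<digits>.<ext>", as in the sample file
# "...pbio.1000436.t001.jpg"; figures use a different letter (e.g. ".g001.").
TABLE_IMAGE_EMBED = re.compile(
    r"\[\[File:[^\]|]*\.t\d+\.(?:jpg|jpeg|png|tif|tiff)(?:\|[^\]]*)?\]\]\n?",
    re.IGNORECASE,
)


def strip_table_image_embeds(wikitext):
    """Drop [[File:...]] embeds whose file name looks like a table image."""
    return TABLE_IMAGE_EMBED.sub("", wikitext)


page = (
    "Intro\n"
    "[[File:Foo-pbio.1000436.t001.jpg|thumb|Table 1]]\n"
    "[[File:Foo-pbio.1000436.g001.jpg|thumb|Figure 1]]\n"
    "End\n"
)
print(strip_table_image_embeds(page))
```

This pattern does not handle captions that themselves contain `[[...]]` links, and it removes the embed regardless of whether an HTML table was rendered, so it only suits the case where the tabular version is known to exist.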

@Klortho
Member

Klortho commented Aug 29, 2014

So this is a JATS-to-MediaWiki issue, then, right? I created wpoa/JATS-to-Mediawiki#37

@wrought
Member

wrought commented Sep 15, 2014

Okay, so the issue on our end is that currently, no table image files are uploaded. This includes an "image only" case and the "HTML + image alternative" case. We can choose to ignore raster versions of HTML tables, or face the task of uploading table images to Wikisource instead of Commons.

@Daniel-Mietchen is that accurate? Do we need the table image files or would you say forgo them?

@Daniel-Mietchen
Member Author

Forgo the raster images if we have the HTML.
