Encode read content as paragraphs, not as lines #24

Open
jjokisch opened this issue Nov 1, 2024 · 2 comments

Comments

@jjokisch
Contributor

jjokisch commented Nov 1, 2024

Though Calderón's plays are generally written in verse, they often contain read passages (usually letters) that are in prose. The original editions mark this clearly (widening the lines, breaking with the metre, starting a new line in the middle of a word, etc.). Currently, these prose passages are encoded as <l> when they should really be <p>. We should fix that.

I think there are two relatively easy ways to find these passages somewhat consistently:

  1. Look for stage directions that clearly state "lee" (or any of the other reasonable forms of "leer": leen, leyendo, etc.)
  2. Look for direct quotes in the dialog («», ””, "")
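
The two searches above could be sketched roughly as follows. This is only a minimal sketch: it assumes the files use the TEI namespace and mark stage directions as <stage>, and the function name `find_candidates` and the list of verb forms are illustrative, not taken from the project.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the play files use the standard TEI namespace.
TEI = "{http://www.tei-c.org/ns/1.0}"

# Forms of "leer" that plausibly occur in stage directions; extend as needed.
READ_RE = re.compile(r"\b(lee|leen|leyendo|leer)\b", re.IGNORECASE)
# Quotation marks that may open or close a quoted passage in the dialogue.
QUOTE_RE = re.compile(r"[«»“”\"]")


def find_candidates(xml_text):
    """Return (stage directions mentioning reading, verse lines containing quotes)."""
    root = ET.fromstring(xml_text)
    stage_hits = [
        "".join(stage.itertext()).strip()
        for stage in root.iter(TEI + "stage")
        if READ_RE.search("".join(stage.itertext()))
    ]
    quote_hits = [
        "".join(line.itertext()).strip()
        for line in root.iter(TEI + "l")
        if QUOTE_RE.search("".join(line.itertext()))
    ]
    return stage_hits, quote_hits
```

Both lists are only candidates: a quoted line is not necessarily a read passage, so every hit still needs a manual check against the edition.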

Afterwards, all we have to do is determine whether each hit contains a read passage in prose and, if so, change the encoding. As the line beginnings are not consistent between editions (after all, why would they be for prose passages?), we wouldn't even have to encode the <lb/>. As a rough estimate, there are 247 hits for "[Ll]ee" in the stage directions across 85 of the plays.
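
Once a passage has been confirmed by hand, the re-encoding itself could look something like the sketch below. It assumes the <l> elements of the passage are contiguous children of one parent (for example an <sp>) and that the TEI namespace is used; `merge_lines_to_paragraph` is a hypothetical helper, not existing project code. Line breaks are dropped, in line with not encoding <lb/>.

```python
import xml.etree.ElementTree as ET

# Assumption: the play files use the standard TEI namespace.
TEI = "{http://www.tei-c.org/ns/1.0}"


def merge_lines_to_paragraph(parent, lines):
    """Replace the given contiguous <l> children of parent with a single <p>."""
    if not lines:
        return None
    position = list(parent).index(lines[0])
    # Normalise whitespace inside each line, then join the lines with one space.
    text = " ".join(" ".join("".join(l.itertext()).split()) for l in lines)
    paragraph = ET.Element(TEI + "p")
    paragraph.text = text
    paragraph.tail = lines[-1].tail  # keep the whitespace that followed the last line
    for l in lines:
        parent.remove(l)
    parent.insert(position, paragraph)
    return paragraph
```

Joining with a space splits any word that the edition broke across two lines, so those passages would need a manual touch-up afterwards.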

@jjokisch
Contributor Author

jjokisch commented Nov 5, 2024

The issue might be more severe. At least one file, la_dama_duende.xml, simply skips all read passages. The initial stage directions "lee" exist, but the paragraphs following them are missing in their entirety. For our work, this means that the absence of a read passage in the files does not automatically mean a silent act of reading, but most likely stems from an encoding mistake.
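
A rough way to look for other affected files might be to flag every "lee" stage direction that is not directly followed by an <l> or <p>. This is a heuristic sketch under assumptions: TEI namespace, stage directions as <stage>, and a hypothetical function name. It only catches cases where no text element follows the stage direction inside the same parent, and whether la_dama_duende.xml has exactly that shape would have to be checked.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the play files use the standard TEI namespace.
TEI = "{http://www.tei-c.org/ns/1.0}"
READ_RE = re.compile(r"\b(lee|leen|leyendo)\b", re.IGNORECASE)


def stages_without_following_text(root):
    """List 'lee' stage directions whose next sibling is neither <l> nor <p>."""
    flagged = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag != TEI + "stage":
                continue
            if not READ_RE.search("".join(child.itertext())):
                continue
            nxt = children[i + 1] if i + 1 < len(children) else None
            if nxt is None or nxt.tag not in (TEI + "l", TEI + "p"):
                flagged.append("".join(child.itertext()).strip())
    return flagged
```

A flagged stage direction is a prompt to compare against the edition, not proof of a missing passage.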

@arojascastro
Collaborator

Let me check whether these paragraphs are missing in the original XML files (v.1). I will get back to you soon.
