People who work in AI and libraries/archives/museums, we need your help! 👋🏻
A few of us maintain an "awesome-ai4lam" 🕶️ list at https://github.com/AI4LAM/awesome-ai4lam and we need your help finding more things to add. Please tell us what we missed!
You can just reply to this toot, or open an issue/ticket in the GitHub repo, or email me, or whatever is easiest for you.
Using machine methods to open up extensive French-language source corpora from the Middle Ages?
In the next #DigitalHistoryOFK, Pauline Spychala (DHI Paris) takes a close look at the text recognition platforms #eScriptorium & #Transkribus. The aim of her project is to develop a workflow that combines both tools effectively in order, among other things, to do justice to the characteristics of the sources under study.
The next PATT meeting will be about teaching automatic text recognition methods to researchers in the #humanities. Interested people are welcome to join (sign up here: Alicia Schümperli, sekteuscher@hist.uzh.ch). #HTR #digitalhumanities
Today, we at the National Archives have officially released an open base model for handwriting recognition. It works best on Swedish manuscripts from about 1650 to 1900.
By base model we mean that it has two intended areas of use:
To run HTR on large amounts of images of handwritten text with good enough quality to index the text for search.
To serve as a starting point for using your own training data to create more specialized HTR models.
Dominique Stutzmann spoke at the IHA to young historians who want to use automatic text recognition. The aim is to understand how it has developed and what impact it has on historical scholarship. Did you miss it? The audio recording is now available on our website!
OK, I've finally got round to transcribing enough pages of my own handwriting to train up a model with #Transkribus, and the results are surprisingly good! I expected to need more than the minimal 25 pages to get a decent level of accuracy but it's already noticeably better than the generic recognition on my reMarkable tablet or OneNote.
Since #PyLaia is open source, it should be possible now to recreate this training on my own desktop with the same parameters, and apply the model to recognise new pages, and from there figure out a workflow to simplify getting handwritten notes into plain text for reference or publication.
I'm David, once an archaeologist and classicist, pretty much always a nerd. Currently working at the Swedish National #Archives leading its little Digital Experiences team.
I hear a frequent complaint about applying quantitative methods to texts that have been through #HTR tools such as #Transkribus: the expected error rate means that you will miss too many occurrences of the word you are looking for. (1/n)
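To put rough numbers on that complaint, here is a minimal sketch, not anything taken from the thread itself. It assumes character errors are independent and occur at a fixed character error rate (real HTR errors tend to cluster, so treat the figures as a back-of-the-envelope estimate). The function names `exact_hit_probability` and `fuzzy_hits` are made up for illustration, and the fuzzy matcher uses Python's standard `difflib`, not any Transkribus feature.

```python
from difflib import SequenceMatcher


def exact_hit_probability(cer: float, word_length: int) -> float:
    """Chance that a word survives HTR with every character correct.

    Assumption: each character is wrong independently with probability `cer`.
    """
    return (1.0 - cer) ** word_length


def fuzzy_hits(query: str, tokens: list[str], threshold: float = 0.8) -> list[str]:
    """Return the tokens whose similarity ratio to `query` is at least `threshold`.

    The ratio comes from difflib.SequenceMatcher and ranges from 0.0 to 1.0.
    The default threshold is an arbitrary illustrative choice.
    """
    query = query.lower()
    return [
        token
        for token in tokens
        if SequenceMatcher(None, query, token.lower()).ratio() >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical numbers: a 5% character error rate and a ten-letter word.
    print(round(exact_hit_probability(0.05, 10), 3))
    # Hypothetical tokens, one of them with a typical l/i confusion.
    print(fuzzy_hits("archives", ["archlves", "archives", "museum"]))
```

Under these assumptions, a ten-letter word comes through intact only about 60% of the time at a 5% character error rate, which is where the worry about missed occurrences comes from. Searching with a similarity threshold instead of exact matching recovers some of the near misses, at the cost of possible false positives.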