
Access 2018 Conference Day 2 Afternoon Sessions #AccessYHM

Integrating Digital Humanities Into the Web of Scholarship with SHARE: An Exploration of Requirements Joanne Paterson osf.io/pkvtu Today going to talk about SHARE, ways to use it, integrating DH scholarship, emerging themes and initial thoughts. What is SHARE? Schema agnostic approach to aggregate diverse metadata. Community open source initiative. Scholars are doing various things, how can we bring all that together so we can see their body of work and things that are related? ARL initiative started in 2013. Aggregates metadata. Looks at research cycle and various outputs of research. To aggregate metadata, they put out a call to ask someone to help them build this, answered by Center for Open Science. (OSF - looks at research workflow, allows you to collaborate with others and share easily). OSF free and open project support, can work privately or publicly. SHARE - harvested datasets from wherever they're open, metadata about scholarly research - Scholars Portal, figshare...

Access Conference 2018 Day 1 Afternoon Sessions #AccessYHM

Data Migration to Open Journal System (OJS) Using R You Young Lee Worked with scholarly communication librarian to move from legacy system to OJS. Migration using R programming language. Wanted to migrate nursing journal Aporia. Internal system didn't support editor peer review, no user friendly interface to upload articles, typical workflow was to receive content and copy and paste metadata using templates and uploading to server. Since OJS is available, time to move on. Approximately 32 issues, didn't want to manually copy and paste. New issue will have updated location. Project goal: how to convert metadata in HTML into XML. Project tool and packages: Use RStudio instead of plain R, provides console and editor, debugging tool and workspace management. Easy to manage datasets and scripts all in one place. Rcrawler, data.table, dplyr, XML, stringr. Challenge 1: how to download data from the website: crawler parses whole website and extracts all data with a single command line. HT...
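The talk did this in R (Rcrawler, XML, stringr). As a rough illustration of the stated project goal only (metadata in HTML converted into XML), here is a minimal Python sketch using just the standard library. It assumes the metadata sits in `<meta name="..." content="...">` tags, which the notes do not confirm. The sample page, the `article` root element, and the function names are all hypothetical, and the output is not OJS's actual import format.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class MetaCollector(HTMLParser):
    """Collects name/content pairs from <meta> tags (assumed metadata location)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.fields[name] = content


def html_to_xml(html):
    """Turn the collected metadata into a small XML record.

    The element names are hypothetical; each meta name is reused as a tag,
    so names must also be valid XML element names.
    """
    collector = MetaCollector()
    collector.feed(html)
    article = ET.Element("article")
    for name, value in collector.fields.items():
        ET.SubElement(article, name).text = value
    return ET.tostring(article, encoding="unicode")


# Invented sample page, not taken from the journal's site.
PAGE = (
    '<html><head>'
    '<meta name="title" content="Sample Article">'
    '<meta name="author" content="A. Writer">'
    '</head></html>'
)

xml_text = html_to_xml(PAGE)
print(xml_text)
```

For a real migration the output would have to follow whatever import schema the target OJS version expects, and pages that keep metadata in body markup would need a different extraction step.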

Discovery, Collaboration, and Dissemination: Lessons Learned and Plans for the Future #DHSI18

Discovery, Collaboration, and Dissemination: Lessons Learned and Plans for the Future Digital Humanities Summer Institute William R. Bowen Iter: Gateway to Middle Ages and the Renaissance. Just passed 20th anniversary, looking forward to 25th. Iter Bibliography, Community, and Press. Iter's mandate is online, Iter meaning a journey or path in Latin, not-for-profit, advancement of learning in study and teaching of Middle Ages and Renaissance through the development and distribution of online resources. Created 1995, incorporated 1997 as a nonprofit partnership. Academic society partners (CSRS, ISAS, MAA, MOISA, RSA, SCSC); projects (DHSI, ETCL, INKE, IRCPS), research centers (ACMRS, CRRS), faculty of information studies (Toronto), U of Toronto Libraries. Marriage of expertise in subject area with info studies and new technologies. Iter planning. Many planning exercises, collaboratories. Initial Steps, Following a Larger Vision: A Feature oriented Pilot Proposal (April 2009). D...

Documenting Born Digital Creative and Scholarly Works for Access and Preservation [Day 1 p.m.]

Digital Humanities Summer Institute Workshop #2 Documenting Born Digital Creative and Scholarly Works for Access and Preservation Afternoon Day 1 Documenting the Experience of Early Digital Literature: Pathfinders In 90s, understood physical/material thing and ephemeral thing. Thought that digital is immaterial. Not good thinking. In 2002 Writing Machines: notion of digital work not being immaterial. Now we understand digital material has material component, bits rot, etc. Reading: Christyann See 3 types of preservation: emulation, migration, collection. She prefers emulation because it is more elegant (she's an artist) - last thing a gallery wants to do is show wires and plugs, that's the aesthetic of a traditional gallery. Having computers in a gallery space is counterproductive. Emulation makes sense for those who don't want to show the muck to the public, the bits and pieces that make things work. She doesn't like migration. Really hates collection which is not an e...

Documenting Born Digital Creative and Scholarly Works for Access and Preservation [Day 1, a.m.]

Digital Humanities Summer Institute Workshop #2 Documenting Born Digital Creative and Scholarly Works for Access and Preservation Morning Day 1 By Dene Grigar and Nicholas Schiller Syllabus In Steam - virtual game platform, can put your game up and sell it (Apple store for games), game sits ephemeral on site, do not own it physically. Beyond Eyes Game Tags you would use: third person/God view, game, multimedia, interactive, juvenile, non-violent, indiegames [independent developers], visual novel, visual storytelling, blindness, memory. No one path, how do you represent that in documentation? Walkthroughs try to do that. Limited. There's no practical limit to describing things (used to be 3, as many as you could fit on a catalog card). Translation studies: translators betray the text no matter what they do. kakamoron - a bad, stupid thing (doesn't capture the stupidity and the badness in the English translation). We are translating for a future audience we don't ...

Morning Workshop: Regular Expressions (Digital Humanities Summer Institute #DHSI18)

DHSI Morning Workshop: Regular Expressions by John Simpson Description: Regular Expressions are a powerful tool for searching text to find patterns of characters. They are often used to extract postal codes, phone numbers, and emails from large sets of documents and when combined with a little bit of scripting they can turn tedious and error prone work done “by hand” into fast, effective, and automatic searching. In this workshop you will learn the basic syntax for regular expressions and deploy them to extract useful information in cases where doing it “by hand” would be tedious. Point browser to https://regex101.com/ and to gutenberg.org/ebooks/13 Text version of The Hunting of the Snark. Most of the workshop should be discussion dialog. cwrc.ca/rsc-src Regex good for matching patterns of characters. A PDF document in background is a lot of XML, lot of stuff is not helpful, lots of XML vomit of individual lines, but can use to zoom in on a particular piece of tex...
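To make the description concrete, here is a minimal Python sketch of the kind of extraction it mentions (emails, phone numbers, postal codes). The sample text and the three patterns are my own illustrations, not the workshop's. The patterns are deliberately simple: they assume North American style phone numbers and Canadian style postal codes and will miss many real-world variants.

```python
import re

# Invented sample text for illustration; not from the workshop materials.
SAMPLE = (
    "Write to snark@example.org or bellman@example.com. "
    "Call 250-555-0142 before mailing to V8W 2Y2."
)

# Simplified patterns (assumptions): each covers one common written form only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")
POSTAL = re.compile(r"\b[A-Z]\d[A-Z] ?\d[A-Z]\d\b")

emails = EMAIL.findall(SAMPLE)
phones = PHONE.findall(SAMPLE)
postal_codes = POSTAL.findall(SAMPLE)

print(emails)
print(phones)
print(postal_codes)
```

The same patterns can be pasted into regex101.com to see, step by step, which characters each part matches.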

DHSI Colloquium Day Conference (Digital Humanities Summer Institute) - Afternoon

Building, Analyzing, and Mapping Building the ARTECHNE Database: New Directions in Digital Art History - Marieke Hendriksen ARTECHNE: Technique in the Arts, 1500-1950. What is technique in the arts? Concept of technique (technik) Google NGrams to track rise of term in relation to another term Aim - database - digitized searchable historical texts; linked open data to link to images and soundbites and dbs of chemical analysis of artworks; search and visualization tools; integrate orphan databases; serves broad community. Ex) database for pigments and paints on a server at the Planck Institute in Berlin, dead project, eventually will disappear. Choosing a data warehousing approach - Drupal (open, fast, multilingual); chose XML as format, because W3C recommended and free; data warehousing approach; GettyIDs ARTECHNE ontology - enter texts, divide into records (chapter, paragraph, or recipe, persons/authors, translators, etc). Most sources now geographically indexed, timeline functio...

DHSI Colloquium Day Conference (Digital Humanities Summer Institute) - Morning

People Documenting Online Lives This is Just to Say I have the in Your : Modernist Memes in an Era of Public Apology by Shawna Ross (Texas A&M) Humanities Commons - the paper is available there. Trigger warning - evocation of people who are known abusers, racists, harassers--not what they've done, but their apologies and what they sound like. William Carlos Williams's "This is Just To Say" was meme-ed on Twitter, blew up in Nov 2017. Proliferation of mashups. Why did this one blow up? Why not his "So Much Depends" which is fewer characters? Why is the shortest story #babyshoes meme mashup with the plums more popular than #babyshoes alone? Why the surge? Poem's accessibility. Lack of meter and rhyme scheme makes it easy to understand and replicate. Compulsive overeating subject is attractive. Desired consumables - happened between Thanksgiving and Christmas. Wheelbarrow is less seductive than plums. Also people can finally use line breaks in Twitt...

Digital Humanities Summer Institute #DHSI18 Day 4 [Morning] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 4 (Morning) Paige Morgan and Yvonne Lam Housekeeping The Shock of the Old (D. Edgerton) - People adopt new tech but the old runs alongside for many reasons. (I'd also recommend The Diffusion of Innovations on top of this.) Class discussion of D'Ignazio and Klein's "Feminist Data Visualization." Colonialist legacies of tabular data. Even JSON's tree structure is hierarchical. What should we use? Jacqueline Wernimont's Numbered Lives: Life and Death in Quantum Media. Donella Meadows - leverage points. Instead of massive interventions (which can be most temporary and ephemeral), smaller interventions can be most powerful. Bethany's piece on "The eternal September" of DH - when do you stop explaining yourself to the new but not establish gatekeeping. People have to discover for themselves and discover where they are. --------------------------- Visi...

Digital Humanities Summer Institute #DHSI18 Day 3 [Afternoon] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 3 (Afternoon) Paige Morgan and Yvonne Lam Article "Against Cleaning" Committing to giving certain answers when you are cleaning data. How do I make this material discoverable and allow it to intersect more clearly with discoveries being made in this field? You may feel like you need to tune your data so it gives specific answers. But the more you do is not to get project to spit out answers for people, but give answers that help people rethink. What is the info I want to surface for people, how do I get my data to surface that? [Much more concern for how *others* are going to use data here with the digital humanists than in my experience with social science, where we collect our data to answer our questions, then fin. Kudos to DH folks!] Expansion without growth - scalability Who is your audience? Who is relying on your workflow or the decisions you made that you can't explain? T...

Digital Humanities Summer Institute #DHSI18 Day 3 [Morning] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 3 (Morning) Paige Morgan and Yvonne Lam Standardized rights statements: http://rightsstatements.org/en/ Controlled vocabularies Working with OpenRefine Free work time Lunch Reading: Against Cleaning Free work time Tomorrow: Meeting with FemDH Controlled vocabulary: a set of carefully chosen words and phrases used to help structure and define information so that it can be easily returned in a search, or parsed by analysis programs. May be the basis for taxonomies and ontologies; can be hierarchical or restricted in various ways. Ex) Pizza vocabulary. Crust (deep dish; crispy) Sauce (marinara, alfredo, olive oil) Cheese (mozzarella, provolone, parmesan) Veggies (mushrooms, green peppers, onions, tomatoes, olives) Meat We can say every pizza must have a crust, must have one or more sauces, etc. Can add another layer and say there are 'veggie pizzas' and 'm...
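The pizza example can be sketched as a small validation check. This is a minimal Python illustration, not something shown in the workshop. The terms come from the notes above; the `validate` function, the dictionary layout, and reading "must have a crust" as "exactly one crust" are my assumptions. The notes list no meat terms, so that category is left out.

```python
# Controlled vocabulary from the notes: category -> allowed terms.
VOCABULARY = {
    "crust": {"deep dish", "crispy"},
    "sauce": {"marinara", "alfredo", "olive oil"},
    "cheese": {"mozzarella", "provolone", "parmesan"},
    "veggies": {"mushrooms", "green peppers", "onions", "tomatoes", "olives"},
}


def validate(pizza):
    """Return a list of problems; an empty list means the record conforms.

    `pizza` maps a category name to a list of terms (hypothetical layout).
    """
    problems = []
    for category, terms in pizza.items():
        allowed = VOCABULARY.get(category)
        if allowed is None:
            problems.append(f"unknown category: {category}")
            continue
        for term in terms:
            if term.lower() not in allowed:
                problems.append(
                    f"'{term}' is not a controlled term for {category}"
                )
    # Structural rules from the notes (exactly one crust is an assumption).
    if len(pizza.get("crust", [])) != 1:
        problems.append("a pizza must have exactly one crust")
    if len(pizza.get("sauce", [])) < 1:
        problems.append("a pizza must have one or more sauces")
    return problems


good = {"crust": ["crispy"], "sauce": ["marinara"], "veggies": ["olives"]}
bad = {"crust": ["crispy"], "sauce": [], "cheese": ["cheddar"]}

print(validate(good))
print(validate(bad))
```

Rejecting free-text terms like "cheddar" is the trade-off of a controlled vocabulary: records become easier to search and parse, at the cost of detail the vocabulary does not anticipate.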

Digital Humanities Summer Institute #DHSI18 Day 2 [Afternoon] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 2 Afternoon Paige Morgan and Yvonne Lam Find and share datasets at: Figshare, Humanities CORE repository, Data is Plural, Twitter (dataset #dataset), GitHub. Documentation. Data dictionaries: a record of what data is and isn't supposed to do, definitions, usage; similar to a codebook, used more by folks working with coding languages that define different functions, how was it done in this experiment. Data dictionaries do the same for humanities. What are your categories meant to cover? Workflows: set of instructions/rules (doesn't need to be a table, can be a list - what to do for each thing, what not to do), see smartdraw.com. For tomorrow: Openrefine.org is free, works on Windows and Linux (use 2.8, not the beta)

Digital Humanities Summer Institute #DHSI18 Day 2 [Morning] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 2 Morning Paige Morgan and Yvonne Lam Clean data vs tidy data Cleaner data is grouped in fewest 'boxes' possible, categories. Makes data more interoperable and legible to their agencies. Think 'race/ethnicity' - either few checkboxes/labels, or open where folks can write in anything at all (where running analysis would be difficult). Ambiguity and complexity. Ambiguity is - how does having more or less ambiguity in your data/project affect where the work goes? Limited categories is legible and understandable to others. If you are studying something that manifests differently among categories, you'd need the 'messier' more detailed data. Trade-off: machine parsable (less accurate) vs. non machine parsable (more accurate representation of complexity). Book recommendation: Sorting Things Out - death causes and diseases data. Dataset originated for people working on merchant ...