Showing posts with the label technology

Access 2018 Conference Day 2 Afternoon Sessions #AccessYHM

Integrating Digital Humanities Into the Web of Scholarship with SHARE: An Exploration of Requirements Joanne Paterson osf.io/pkvtu Today's talk covers SHARE, ways to use it, integrating DH scholarship, emerging themes, and initial thoughts. What is SHARE? A schema-agnostic approach to aggregating diverse metadata, and a community open source initiative. Scholars are doing various things; how can we bring all that together so we can see their body of work and things that are related? SHARE is an ARL initiative started in 2013. It aggregates metadata and looks at the research cycle and the various outputs of research. To aggregate metadata, they put out a call asking for someone to help build this, which was answered by the Center for Open Science. (OSF looks at the research workflow and allows you to collaborate with others and share easily.) OSF offers free and open project support; you can work privately or publicly. SHARE harvests datasets from wherever they're open, plus metadata about scholarly research: Scholars Portal, figshare...

Access Conference 2018 Day 1 Afternoon Sessions #AccessYHM

Data Migration to Open Journal System (OJS) Using R You Young Lee Worked with the scholarly communication librarian to move from a legacy system to OJS, migrating with the R programming language. Wanted to migrate the nursing journal Aporia. The internal system didn't support editor peer review and had no user-friendly interface for uploading articles; the typical workflow was to receive content, copy and paste metadata using templates, and upload to the server. Since OJS is available, time to move on. Approximately 32 issues, didn't want to manually copy and paste. New issues will have an updated location. Project goal: how to convert metadata in HTML into XML. Project tool and packages: use RStudio instead of plain R; it provides a console and editor, debugging tools, and workspace management. Easy to manage datasets and scripts all in one place. Packages: Rcrawler, data.table, dplyr, XML, stringr. Challenge 1: how to download data from the website: the crawler parses the whole website and extracts all data with a single command line. HT...
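
The project goal above (turning metadata embedded in HTML into XML) can be sketched roughly as follows. This is only an illustration in Python using the standard library, not the R/Rcrawler workflow from the talk. The sample markup, the class names `title` and `author`, and the output element names are all hypothetical; the real Aporia pages and the OJS import schema are not shown in the notes.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical legacy markup (assumption: the real pages are not in the notes).
SAMPLE_HTML = """
<div class="article">
  <h2 class="title">Example Article Title</h2>
  <span class="author">A. Author</span>
</div>
"""


class ArticleParser(HTMLParser):
    """Collect the text of tags whose class is 'title' or 'author'."""

    def __init__(self):
        super().__init__()
        self.current = None   # metadata field we are currently inside, if any
        self.fields = {}      # field name -> extracted text

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("title", "author"):
            self.current = css_class

    def handle_data(self, data):
        # Store the first non-empty text chunk after a matching start tag.
        if self.current and data.strip():
            self.fields[self.current] = data.strip()
            self.current = None


def html_to_xml(html):
    """Return a small XML string built from the extracted metadata."""
    parser = ArticleParser()
    parser.feed(html)
    article = ET.Element("article")  # hypothetical element name, not the OJS schema
    for name, value in parser.fields.items():
        ET.SubElement(article, name).text = value
    return ET.tostring(article, encoding="unicode")


print(html_to_xml(SAMPLE_HTML))
```

A real migration would loop over every crawled page and map the fields onto whatever import format the target OJS version expects.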

Digital Humanities Summer Institute #DHSI18 Day 4 [Morning] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 4 (Morning) Paige Morgan and Yvonne Lam Housekeeping The Shock of the Old (D. Edgerton) - People adopt new tech but the old runs alongside for many reasons. (I'd also recommend The Diffusion of Innovations on top of this.) Class discussion of D'Ignazio and Klein's "Feminist Data Visualization." Colonialist legacies of tabular data. Even JSON's tree structure is hierarchical. What should we use? Jacqueline Wernimont's Numbered Lives: Life and Death in Quantum Media. Donella Meadows - leverage points. Instead of massive interventions (which can be the most temporary and ephemeral), smaller interventions can be the most powerful. Bethany's piece on "The eternal September" of DH - when do you stop explaining yourself to newcomers without establishing gatekeeping? People have to discover for themselves and discover where they are. --------------------------- Visi...

Digital Humanities Summer Institute #DHSI18 Day 3 [Afternoon] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 3 (Afternoon) Paige Morgan and Yvonne Lam Article "Against Cleaning" When you clean data, you are committing to giving certain answers. How do I make this material discoverable and allow it to intersect more clearly with discoveries being made in this field? You may feel like you need to tune your data so it gives specific answers. But the aim is not to get the project to spit out answers for people, but to give answers that help people rethink. What is the info I want to surface for people, and how do I get my data to surface that? [Much more concern for how *others* are going to use data here with the digital humanists than in my experience with social science, where we collect our data to answer our questions, then fin. Kudos to DH folks!] Expansion without growth - scalability Who is your audience? Who is relying on your workflow or the decisions you made that you can't explain? T...

Digital Humanities Summer Institute #DHSI18 Day 3 [Morning] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 3 (Morning) Paige Morgan and Yvonne Lam Standardized rights statements: http://rightsstatements.org/en/ Controlled vocabularies Working with OpenRefine Free work time Lunch Reading: Against Cleaning Free work time Tomorrow: Meeting with FemDH Controlled vocabulary: a set of carefully chosen words and phrases used to help structure and define information so that it can be easily returned in a search, or parsed by analysis programs. May be the basis for taxonomies and ontologies; can be hierarchical or restricted in various ways. Ex) Pizza vocabulary. Crust (deep dish, crispy) Sauce (marinara, alfredo, olive oil) Cheese (mozzarella, provolone, parmesan) Veggies (mushrooms, green peppers, onions, tomatoes, olives) Meat We can say every pizza must have a crust, must have one or more sauces, etc. Can add another layer and say there are 'veggie pizzas' and 'm...
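
As a rough illustration of how the pizza vocabulary could be enforced, here is a minimal Python sketch. The terms are copied from the example above; the function name `validate_pizza`, the record layout, and the reading of "must have a crust" as "exactly one crust" are assumptions made for the sketch. Meat is left out because the notes list no meat terms.

```python
# Controlled vocabulary: terms copied from the pizza example in the notes.
VOCABULARY = {
    "crust": {"deep dish", "crispy"},
    "sauce": {"marinara", "alfredo", "olive oil"},
    "cheese": {"mozzarella", "provolone", "parmesan"},
    "veggies": {"mushrooms", "green peppers", "onions", "tomatoes", "olives"},
}


def validate_pizza(pizza):
    """Return a list of problems; an empty list means the record conforms.

    `pizza` maps a field name to a list of chosen terms (hypothetical layout).
    """
    problems = []
    for field, values in pizza.items():
        allowed = VOCABULARY.get(field)
        if allowed is None:
            problems.append(f"unknown field: {field}")
            continue
        for value in values:
            # Compare case-insensitively so 'Provolone' matches 'provolone'.
            if value.lower() not in allowed:
                problems.append(f"{field}: '{value}' is not in the vocabulary")

    # Structural rules (assumption: "a crust" is read as exactly one crust).
    if len(pizza.get("crust", [])) != 1:
        problems.append("a pizza must have exactly one crust")
    if len(pizza.get("sauce", [])) < 1:
        problems.append("a pizza must have one or more sauces")
    return problems


print(validate_pizza({"crust": ["crispy"], "sauce": ["marinara"], "cheese": ["Provolone"]}))
print(validate_pizza({"crust": ["crispy"], "sauce": ["ketchup"]}))
```

The same idea applies to any column with a restricted term list: terms outside the list are reported instead of silently accepted.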

Digital Humanities Summer Institute #DHSI18 Day 2 [Afternoon] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 2 Afternoon Paige Morgan and Yvonne Lam Find and share datasets at: Figshare, Humanities CORE repository, Data is Plural, Twitter (dataset, #dataset), GitHub. Documentation Data dictionaries: a record of what data is and isn't supposed to do, with definitions and usage. Similar to a codebook, which is used more by folks working with coding languages that define different functions (how was it done in this experiment). Data dictionaries do the same for humanities. What are your categories meant to cover? Workflows: a set of instructions/rules (doesn't need to be a table, can be a list - what to do for each thing, what not to do); see smartdraw.com. For tomorrow: OpenRefine (openrefine.org) is free, works on Windows and Linux (use 2.8, not the beta)
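
One possible way to make a data dictionary machine-checkable is sketched below in Python. The column names, definitions, and the `check_row` helper are all hypothetical and only illustrate the idea of recording what each category is meant to cover.

```python
# Hypothetical data dictionary: each column gets a definition, a type,
# and a flag saying whether every row must fill it in.
DATA_DICTIONARY = {
    "title": {"definition": "Title as printed on the item", "type": str, "required": True},
    "year": {"definition": "Year of publication", "type": int, "required": True},
    "notes": {"definition": "Free-text remarks by the transcriber", "type": str, "required": False},
}


def check_row(row):
    """Compare one row (a dict) against the data dictionary and list issues."""
    issues = []
    for column, spec in DATA_DICTIONARY.items():
        if column not in row:
            if spec["required"]:
                issues.append(f"missing required column: {column}")
            continue
        if not isinstance(row[column], spec["type"]):
            issues.append(f"{column}: expected {spec['type'].__name__}")
    # Columns that appear in the data but were never documented.
    for column in row:
        if column not in DATA_DICTIONARY:
            issues.append(f"undocumented column: {column}")
    return issues


print(check_row({"title": "A Sample Title", "year": 1850}))
print(check_row({"title": "A Sample Title", "year": "1850", "place": "London"}))
```

Even without code, the same table of column, definition, and usage serves as the human-readable data dictionary described above.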

Digital Humanities Summer Institute #DHSI18 Day 2 [Morning] Making Choices About Your Data

Making Choices About Your Data Digital Humanities Summer Institute #DHSI18 Day 2 Morning Paige Morgan and Yvonne Lam Clean data vs tidy data Cleaner data is grouped in the fewest 'boxes' (categories) possible. This makes data more interoperable and legible to their agencies. Think 'race/ethnicity' - either a few checkboxes/labels, or open where folks can write in anything at all (where running analysis would be difficult). Ambiguity and complexity. Ambiguity is - how does having more or less ambiguity in your data/project affect where the work goes? Limited categories are legible and understandable to others. If you are studying something that manifests differently among categories, you'd need the 'messier' more detailed data. Machine parsable: less accurate. Non machine parsable: more accurate representation of complexity. Book recommendation: Sorting Things Out - death causes and diseases data. Dataset originated for people working on merchant ...

Digital Humanities Summer Institute #DHSI18 Day 1 [Afternoon] Making Choices About Your Data

Digital Humanities Summer Institute #DHSI18 Day 1 Afternoon Paige Morgan and Yvonne Lam [ #wrangledata ] Are you in it for the process or the product? Need to be sure you and your tenure committee and chair are on the same page. Ex - Old Bailey Online is most successful. Over 300 years of records from London's criminal court. Can search all sorts of facets. Project has several controlled vocabularies for offenses, verdicts, sentences, etc. This successful project was funded by UK granting agencies that grant within high 6 figures into low 7 figures (pounds, not dollars) - that's the kind of money it takes for a source project. Depending on where you get news of DH from, you'll hear about different types and aspects of DH. Twitter: cool projects, I'm looking for this kind of tool, omg this tool is failing, small projects and struggling with DH. Not going to hear that from the elite and official sources - if reading from mainly elite official sources, your first o...

Digital Humanities Summer Institute #DHSI18 Day 1 [Morning] Making Choices About Your Data

Digital Humanities Summer Institute #DHSI18 Day 1 Morning Paige Morgan and Yvonne Lam [ #wrangledata ] Goals Spreadsheet of data and metadata you can take to a librarian or developer Clearer idea of what research questions you can ask of your data Better sense of what tools would be a good fit for your data; or what you would need to do to your data to make it work better with certain tools Start of specific plans about work that you want to do ON your data So much depends on what you're going to prioritize because you are not going to learn all the things at once. Encouragement to think carefully, realistically, and generously with ourselves about setting goals for what we're going to learn. Goals, milestones, FemTechNet MEALS Framework Idea is to poke a little bit at assumptions we have about how technologies work and good ways of using them, what's an acceptable thing to apply technology to (mostly discussing digital tech). Not only is there this idea of ho...