Friday, September 29, 2017

Access 2017 Conference Day 2 Notes Sessions 4-7 #accessYXE

Session 4: IIIF You Can Dream and Not Make Dreams Your Master - IIIF In The Real World - Peter Binkley

IIIF (pronounced "triple-eye-eff") provides, above the tiled-image level, a whole presentation level: a book reader like the Internet Archive's, but the manifest itself is a nicely structured package of information about a digitized object that can be used for many purposes. Useful well beyond the eye candy of deep-zoom image viewers.

See - learn 90% of what you want to know here

About 100 participating institutions in the IIIF community, including the Internet Archive and the Getty. There is also a github.com/iiif organization; a repository worth looking at there is awesome-iiif, the community-maintained list of resources, demos, links, etc.

Goals of IIIF: provide an unprecedented level of uniform, rich access to image-based resources for scholars; define APIs that support interoperability between repositories; provide a world-class user experience in viewing, comparing, manipulating, and annotating images. Annotating is very exciting! IIIF uses the Open Annotation specification to let users create annotations that you can then let others view.

A family of APIs. The Image API does the tiling: the Google Maps effect of infinitely deep zoom into an image via detailed tiles cut from the image, so you're not downloading the huge image all at once. The Presentation API is the most interesting and impactful: a metadata package with structural metadata about the items you're presenting. It can link to descriptive metadata, but can only embed it as name-value pairs used as labels; you shouldn't be doing structured metadata here. The Search and Authentication APIs are the most recent additions. Search is intended to let a IIIF client drive a full-text search of a document against a server-side index, so clients can run queries against your items. The Authentication API is for those who need to authenticate access to images and tiles; example from Canadiana. All of these APIs are based on JSON-LD. Most promising: every time you present a book through IIIF you are creating linked data, with URIs available for use beyond the IIIF system.

The tiled-image layer depends on a specific URL pattern: build a URL with five elements appended to the identifier of the image you want: region [cropped tile area], size [1024], rotation [0], quality [default], format [jpg], where the bracketed values are examples. As it creates the tile you've asked for, the server applies the operations in the order listed. You're not often making these by hand; usually a JavaScript client running in the browser does. Example of Princeton and their illuminated manuscript shown in the demo.
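A minimal sketch of assembling such a URL from the five elements described above (the base URL and identifier are hypothetical; the path order follows the IIIF Image API pattern of region, then size, rotation, quality, and format):

```python
def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL. The server applies the
    operations in this order: region -> size -> rotation -> quality -> format."""
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Ask for a 512x512 crop from the top-left corner, scaled to 1024px wide
url = iiif_image_url("https://example.org/iiif", "page-001",
                     region="0,0,512,512", size="1024,")
print(url)
```

In practice a viewer like OpenSeadragon or Mirador generates thousands of these requests as the user pans and zooms.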

You need to put processing power behind it to meet requests. All of the info a IIIF client needs to determine what it should ask for is encapsulated in a file called info.json; there is one for every image in the collection, sometimes created on the fly. It tells the application what the options are: the client learns what the server can do and makes only requests the server can actually serve, based on that menu of options. This is linked data. JSON-LD is linked data expressed in JSON: subject, predicate, object. JSON is reasonably concise, but this is full RDF linked data, which is a good thing.
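A hedged sketch of what a minimal Image API 2.x info.json might look like (the identifier and dimensions are invented; check the spec for the fields your compliance level requires):

```python
import json

# Illustrative info.json contents: the client reads the image's dimensions
# and the tile sizes / scale factors the server offers, then requests tiles
# accordingly.
info = {
    "@context": "http://iiif.io/api/image/2/context.json",
    "@id": "https://example.org/iiif/page-001",   # hypothetical image URI
    "protocol": "http://iiif.io/api/image",
    "width": 6000,
    "height": 4000,
    "tiles": [{"width": 512, "scaleFactors": [1, 2, 4, 8]}],
    "profile": ["http://iiif.io/api/image/2/level2.json"],
}
print(json.dumps(info, indent=2))
```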

Three levels of compliance; level 2 supports more possibilities and features. Level 0 specifies a level of functionality that can be satisfied by static, pregenerated files; the client will ask only for those limited images to present to the user. A low level of technical commitment to get on the IIIF bandwagon. He has written a Jekyll plug-in that lets you throw hi-res images in and get a Level 0 IIIF presentation on a static blog, and it pretty much works.

Presentation of a Book of Hours from the Bibliothèque de Genève: zoom, side-by-side page view, click on info and see the embedded metadata of simple name-value pairs. If you want to compare, the Mirador client lets you open two items side by side.

All driven by the Shared Canvas model: the specification describes each image as a canvas onto which we paint other things. There might be text annotations we want to overlay or highlight; these count as annotations going onto the canvas. [Personal note: Gah! This might work for Island stuff.] Miniatures cut from medieval manuscripts and now held at other libraries can be overlaid digitally, restoring the manuscript by painting them into their coordinates. Once you go IIIF, a third party can pull your tiles and publish.
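An illustrative sketch (invented URIs throughout, not a real manifest) of the shared-canvas idea: a Presentation API manifest holds canvases, and an image is "painted" onto a canvas as an annotation, so other content such as transcriptions or cut-out miniatures can be painted onto the same coordinate space.

```python
# Minimal shape of a IIIF Presentation 2.x manifest: one sequence, one
# canvas, one image annotation with motivation sc:painting.
manifest = {
    "@context": "http://iiif.io/api/presentation/2/context.json",
    "@id": "https://example.org/iiif/book1/manifest",   # hypothetical
    "@type": "sc:Manifest",
    "label": "Example Book",
    "sequences": [{
        "@type": "sc:Sequence",
        "canvases": [{
            "@id": "https://example.org/iiif/book1/canvas/p1",
            "@type": "sc:Canvas",
            "label": "p. 1",
            "width": 6000, "height": 4000,
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",   # paint the image onto the canvas
                "resource": {
                    "@id": "https://example.org/iiif/page-001/full/full/0/default.jpg",
                    "@type": "dctypes:Image",
                },
                "on": "https://example.org/iiif/book1/canvas/p1",
            }],
        }],
    }],
}
print(manifest["@type"])
```

Additional annotations (text overlays, miniatures held elsewhere) would target the same canvas URI with region coordinates.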

We have a responsibility to maintain URIs: they must be usable as the basis of open annotations others will make, which may live on other servers. If we reorganize and break URIs, it's a disservice to scholars, so we must treat it as linked data.

Session 5: Cantaloupes and Canvases: Adopting IIIF Specifications - Sascha Adler

Why IIIF at Canadiana (a pipeline from digitization to preservation to access: a repository providing access, some collections free, some subscription-based, specializing in early Canadian history)? Why adopt it?

The Content Server (COS) serves derivative images and direct file access (full PDFs). Everything else goes through CAP, which does everything: presentation, search, user and institution authentication, admin tools, CMS, etc. Tough to maintain and update. Moving from monolith to modules: smaller interconnected services to improve the codebase. IIIF helps because multiple services will be publicly accessible, and it would be nice to use open, internationally adopted standards. Replace our modules with community-built software so we maintain less code. They also want to expose content via public APIs; right now it can only be accessed through the web interface. IIIF also helps in thinking about new features: annotations are very powerful. They would love transcriptions for handwritten collections where OCR doesn't work, and search-term highlighting on top of images; the IIIF annotation spec allows them to think about this. Right now they don't do tiled image rendering, but with the IIIF standards adopted they could drop in a viewer and be good to go.

6 month progress so far:

Cantaloupe - derivative image generation
Written in Java; handles JPEGs, TIFFs, PDFs, and JPEG 2000. Easy to configure, but watch out for the properties file and

see config and Docker image at

Direct file access spun out to a separate service; the CAP image viewer rewritten to make requests with IIIF.

Converting series to collections and items to manifests for IIIF; the output can be plugged into IIIF-compatible viewers.

Authentication: Cantaloupe is configured to handle only authorized requests, which is easy to set up. Full integration of the IIIF Authentication API will take more work: given a user and an image, what rights does the request have? (Doable, but will take time.)

Next: the Authentication API; converting internal presentation output to IIIF; evaluating IIIF viewers; internal data models for annotations (explore and develop models).

see for their work

Session 6: An Open Source Approach to Proactively Manage EZProxy - Jason Zou 

Cancelled but Slides available here - Session author unavailable

Abstract: It is a daunting task to manage EZProxy without any additional tools, especially when various vendors notify you that they have to cut off your institution's access because of massive or illegal downloading. Where to start? The presenter will share his experiences of using an open source approach to proactively monitor and manage EZProxy.

Session 7: Kenny: A Web-Based Tool for EZProxy Authentication Management

A web-based login management tool. Why "Kenny"? Design decisions are an important part of any project: what server platform, what app framework, what license model... but the NAME. Manage log ins... loggins... Kenny Loggins.

Problem: academic libraries are a constant target for academic theft. User accounts get compromised; there's probing from servers and bots. There's also an occasional need to override a user's level of access (demoting from current student to alumni status, for example). We run EZProxy to manage this. EZProxy and other servers are under constant attack by big organizations, not little users, aiming to harvest the data we license access to. Databases are grouped by access status, but the occasional override of a user's access is still needed.

But EZProxy! EZProxy does provide some ways to reject IPs and override the access groups folks are assigned to: the RejectIP directive, and a user.txt file (baduser:deny). Disadvantages: to make changes, those in charge need SSH access to the EZProxy server and permission to write those files. Changes in config.txt require the EZProxy daemon to be restarted. Root privileges? Yeah right, not giving that out. Also no record of who blocked what user on what day and why.
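Based only on the two mechanisms mentioned above, the built-in approach looks roughly like this (a sketch; check the EZproxy documentation for exact directive syntax):

```text
# config.txt -- reject requests from an address (daemon restart required)
RejectIP 203.0.113.42

# user.txt -- deny a specific account
baduser:deny
```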

If others are working with patrons to reinstate access, they would need to kick a support ticket to IT to fix it: extra delay and a disservice. Kenny as solution: a web-based tool that allows any authorized staff member to manage blocking, or override the assigned EZProxy groups, for patron accounts and IP ranges. It doesn't run on the EZProxy server and doesn't require staff to have SSH or root access to the server. It uses a standard authentication method supported by EZProxy, so no reboots of EZProxy are needed. Designed to capture a complete history of who was blocked, when, why, and by whom. If configured properly, Kenny can terminate sessions on the EZProxy server (blocking a user is only half the problem; an open connection keeps working even after the user is blocked, for as long as the session stays open).
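A hypothetical sketch (not Kenny's actual code) of the core idea described above: a post-authentication filter that checks each user against a blocklist and records who blocked them, when, and why, so there is a complete audit history.

```python
from datetime import datetime, timezone

blocklist = {}   # username -> current block record
audit_log = []   # complete history of block actions

def block_user(username, blocked_by, reason):
    """Block an account and record who did it, when, and why."""
    record = {"user": username, "by": blocked_by, "why": reason,
              "when": datetime.now(timezone.utc).isoformat()}
    blocklist[username] = record
    audit_log.append(record)

def is_allowed(username):
    """The filter step: True if the already-authenticated user may proceed."""
    return username not in blocklist

block_user("baduser", "jsmith", "compromised account, mass downloading")
print(is_allowed("baduser"), is_allowed("gooduser"))
```

In the real tool this state lives in a database, and the filter runs as EZProxy's External authentication step after CAS or LDAP has validated the credentials.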

EZProxy supports two custom authentication methods, CGI and External; Kenny uses External (it runs after CAS or LDAP and acts as a filter).

The default list shows all currently blocked users; you can also import a text file of blocked users into Kenny.
Great for those who deal with compromised accounts in EZProxy.

A PHP application, developed with the Laravel 5.4 framework. Bootstrap, with support for Bootswatch themes. Session management is accomplished via HTTP requests to the standard EZProxy web interface. Open source, coming soon to GitHub.

Access 2017 Conference Day 2 Notes Lightning Talks #accessYXE

Slack In/Out Board Integration
A whiteboard kept track of people in/out and messages. Names of retired people needed removing, people needed to be reordered, no indication of when a status was set, someone has to change the date, sigh.
A /slack emoji command sets your status. A monitoring bot lets you see server status. Took the Slack status via the Slack API and messaging, pushed it to a web page: the new status board is a monitor on the wall. Encourages people to use Slack; the date updates automatically, it shows the time of the status update, and it can be updated remotely.
How does it work? The user updates their status in Slack; it moves to a campus server visible to the outside world, which uses server-sent events to push the update to the page.
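A sketch of the last hop described above: formatting a status update as a server-sent event for the status-board page. The field names are hypothetical; the SSE wire format (a "data:" line terminated by a blank line) is standard.

```python
import json

def to_sse(status):
    """Wrap a status dict as one server-sent event message."""
    return f"data: {json.dumps(status)}\n\n"

event = to_sse({"user": "alice",
                "status": ":coffee: back at 2pm",
                "updated": "2017-09-29T13:05:00Z"})
print(event, end="")
```

On the browser side, an `EventSource` listening to the endpoint would receive each message and update the board without a page reload.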

Making Breaking Up Easier - Krista Godfrey
Change is hard; we can make it easier. UX and usability can help: test, test, test. Test always and often: during the RFP, on other instances, during training (you are the first tester), in the sandbox (use staff and users), in production (testing doesn't stop after launch). Terminology, back end, interface.

Usability/UX testing: A/B testing (other library instances, new vs. old, terminology). Scenario-based testing. Love/hate letters: how will the new system improve things, and how will you use your migration to do testing? You must make it a priority and include it in the migration process at the various points.

Andrew Nagy w/EBSCO Director of Software Innovation Folio - Open Source Library Services Platform

A collaborative project with universities and vendors to build a platform. It is a platform for development; it doesn't have to be an ILS. It can be a development platform to work from: data storage, SAS-based routing of requests. It doesn't have to be an ILS replacement -

Innovation Challenge: $100,000 to give out to libraries, if you want to build new tech or are working on something and could use additional funding. Caveat: it must be built on the FOLIO platform. $5-20k awards. UIUC: an equipment loan form and the HOOT wait-list service; Villanova: VuFind. You don't need developers; there are interested folks you can partner with.

Child Care at Conferences - Amy Patterson
At the last Access, Kris Berg got pushback for saying that you can get more women in tech by taking down the Star Trek posters. But a small thing you can do is consider childcare needs. Childcare concerns affect women in general more than men, and they affect women in more vulnerable positions disproportionately; child-raising coincides with precarious early-career positions. The why is obvious; the how is more complicated: licensing, cost. But no message is a message to stay home.

Megan Stecyk - Digital Citizenship in Curricula
Kindness online. If kids don't develop empathy by a certain age, will they be a troll forever?

Dale Askey on Leadership and Participation
If you are here, you care about applying and developing technology. If you self-define as an administrator, raise your hand. (Few.) If not an admin, how many have this as a vague career goal? Fairly few. How many of you participate actively in recruitment and hiring of higher admin? (Lots, halfhearted, but not everybody.) Everyone should be. There is disgruntlement and dissatisfaction among library IT about not being part of core strategy. Legacy administrators whose consciousness formed in a pre-Google, telnet, greenscreen era have a stranglehold on library leadership. The needle hasn't moved a great deal, which means we don't move forward and evolve fast enough for the academy or our communities. Urge admins to show up where technology is being talked about, here or elsewhere; CNI is a great one, but challenge them to go past that. If they don't go, come back and advertise it to them. Consider leadership yourself, be the change you want to see, blah blah, but it's true. And TAKE PART IN SEARCHES. Take a more active role.

Peter Binkley - WPA Union Catalog Project 1923(?)
(WWII-era workflow.) Microfilm, film slides. A union catalog for ILL support, so known items only: author cards and serial cards. Find the cards and tip them up so the microfilm technicians can see them easily. Recordak cameras for photographing file cards, producing 16mm microfilm. From the lab, 100-foot lengths of microfilm with cards, which now need to be typed up. Stamping of cards, filed into the union catalog; extra cards go back to the source catalog (handwritten). Microfilm is great for this: a card with non-Roman characters can't go through the typewriters, so a card flagged with a punched notch in the microfilm moves to the fancy-typewriter person in the workflow.

Sara Allain - Acceptance Tests: Using Gherkin Syntax to Define Features
To simplify the feature development process. Gherkin is an acceptance-test language: it lets you describe software behavior without detailing how that behavior is implemented. Perfect for non-developers who need to define how a feature will work from the user's perspective. Plain text, indents and line breaks; no markdown or markup. They use a Google Doc. It serves two purposes. Documentation: before starting a feature, you have a skeleton of documentation even before there's a feature to play with. And it provides the basis for automated tests (Archivematica and [other] don't have them, which leaves a huge burden of functional testing before each release). Developers use the exact feature files to build automated tests. Lightweight and more focused than user stories.
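A hypothetical feature file in the shape described above (the feature and steps are invented for illustration, not taken from the project):

```gherkin
Feature: Encrypt AIPs before storage
  So that packages are protected at rest,
  as an archivist
  I want AIPs to be encrypted when stored in an encrypted location.

  Scenario: Store an AIP in an encrypted location
    Given an encrypted storage location is configured
    When I ingest a transfer and store the AIP in that location
    Then the stored AIP is encrypted
    And I can download and decrypt the AIP
```

Each Given/When/Then line later maps to a step definition in the test harness, which is how the same file doubles as documentation and as an automated test.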

Clear enough that it can be shared with collaborators (this is an encrypted-AIP project).

Developers are closely involved with feature scoping from the beginning; this is collaborative, not just handing them a document to work with. They use Selenium and behave to automate the tests. Everyone has a better understanding of what the feature will or will not do.

Arduino - Cody Fullerton at U of M
A liaison librarian in the science and socsci/hum libraries. Technology loans: students don't bring equipment with them or don't have it. Computer science students need Arduinos, etc. These will be only in the engineering/compsci library, with documentation online for those who may not be familiar with them. Will test this semester and next; it might be a jumping-off point to lend other things like Raspberry Pis.

Open Access and Cost Per Use - Ryan [?]
"The State of OA" article is available (might be a preprint). Since 2015, 45% of journal articles are open access. But we're still paying! Budget pressures, the serials crisis: there has to be a way we can be more cost-effective. The paper identified four types of OA: Gold (author pays), Green (author deposits in a repository), Hybrid (some gold, some not, author pays), and Bronze (journals that made items accessible with no copyright info; it looks OA but the status might change). When libraries evaluate journals we do cost-per-use, but a bunch of open access material is caught in there and it's not obvious. oaDOI will find an open access version of an article; it picks up hybrid, repositories, and bronze. oaDOI just released v2 with ISSNs of journals, so you can see what percentage of a journal is open access. Bronze OA is still frustrating: temporary, and we don't know what's going on. We need an index that commits publishers to keeping that material open so it's always available in the knowledge base, the way DOAJ standardized Gold OA.
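A hedged sketch of the cost-per-use idea above: given per-article OA status (for instance from an oaDOI-style lookup; the field names and numbers here are invented toy data, not real API output), estimate what share of a journal's usage was actually of open access articles.

```python
# Toy usage data: each record is one article with its OA status and
# how many times it was used.
articles = [
    {"doi": "10.1000/a", "is_oa": True,  "uses": 120},
    {"doi": "10.1000/b", "is_oa": False, "uses": 300},
    {"doi": "10.1000/c", "is_oa": True,  "uses": 80},
]

oa_uses = sum(a["uses"] for a in articles if a["is_oa"])
total_uses = sum(a["uses"] for a in articles)
oa_share = oa_uses / total_uses
print(f"{oa_share:.0%} of uses were of OA articles")
```

Subtracting the OA share from a journal's usage before computing cost-per-use gives a truer picture of what the subscription is actually buying.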

Access 2017 Conference Day 2 Notes Sessions 1-3 #accessYXE

Session 1: The UX of Online Help - Ruby Warren

Around 2015-16, a web redesign. Back to basics: usability testing had already been done, but there was a more fundamental issue. Did interviews about the library's website with different user groups: undergrads, grad students, faculty, regional folks. What they go to it for, what they do, when it happens: only when they have a problem that needs fixing and they cannot wait anymore (midnight, weekends, weekend midnights). They needed an asynchronous help option. Internally called the Help Hub: a series of 55 videos and text tutorials arranged by user group (U of Manitoba Libraries), built in LibGuides, using user-group-appropriate language.

After 8 months, do they use it? Everybody hates new things. But yes, they did: a spike in September, with usage following the pattern of the academic year. They're going to it, but does it work? 9 usability tests (a high number): ensuring users can get to the help area, that navigating to tutorials is intuitive, that the language makes sense. 35 interviews (12 complete so far, also a high number): participants asked to compare the content of videos (different styles), video vs. text, and general questions about online learning (approach, resources, etc.).

Results: terminology issues; no clear links to talk to a person ("make an appointment" was in About Us, and no one could find it); some videos didn't have intuitive titles (new-item vs. known-item searching); the help area didn't have "help" in the title (it was called "Library Services" and then divided by user group). Video length issues (users won't watch a 9-minute video even if they know they need it); top-of-page navs remain invisible. Old content ruins literally everything (people WILL find it even when it's not linked from the website). User-group divisions do seem intuitive; users liked them and know the content is relevant to them.

Interviews: video > text and images > interactive > plain text.
Interactive content is viewed as best for learning, but will only be used if mandatory. (!) Users seek out video first. Most expressed a need for video, text, and images to all be available because different situations call for different things: they prefer video, but if figuring something out in public with no headphones, they need another option. Contextually dependent: they want video the first time, but later they'll want text or images.

In video: they preferred plain clickthroughs to cutesy cartoon design, with unobstructed views of screencaps practically demonstrating what they need to do. More boring, but a slower speaker, clarity, and a clear screen were preferred, along with consistent audio: same voice, same kind of sound. Content should be chunked into small tasks (under 3 minutes); chunking is important because they balk at anything longer, but they also get annoyed when a video doesn't teach them everything they need to know. The exception is when you can daisy-chain videos. Text: make it Buzzfeed, i.e. easily skimmable.

They're adding a panel to each tutorial video to facilitate daisy-chaining.

Q&A - there are users who search for library videos *directly* in YouTube

Session 2: Can Link: A Linked Data Project for Canadian Theses / Cana Lien: Un Projet de Données Liées pour les Thèses Canadiennes - Rob Warren

The Canadian Linked Data Initiative (CLDI) is a loose collaborative of academic libraries in Canada, formed in 2015. Its goal is to work together to plan, facilitate, and seek funding for linked data projects, and to grow skills. By involving many, it bridges units within libraries, works across different types of libraries, grows expertise, and bridges Canadian, US, and European linked data initiatives. CLDI consists of several working groups and a steering group.

Can Link / Cana Liens: highlight the intellectual contributions of grad students in Canada through a focus on theses, demonstrating the power of linked data to surface unexpected connections spanning space and time, linking individuals and topics of study.

They asked for as many records as possible: authority data (URIs), MARC, XML, CSV, for an initial bootstrap of 5,000 records, with authority files and accessible digital objects added when possible. A Linked Open Data approach; the setup uses as many formats as possible. The 90% use case is a citation or getting the thesis to read. Fully dereferenceable URIs.

It is ontology-backed. Computers and people are the same in that they are not psychic: if you are going to share data, you must tell both computers and people how things work and how you see the world. (The ontology slide is amazing: FRBR, LOC, DOI, ORCID... get this slide.)

Infrastructure (also get this slide): daily data dumps, Twitter pushes, GitHub ticketing integration so librarians can decide the correct action with no command-line work to resolve issues. Code available on the web: Perl, Django, Python... all of the code is available.

Core functionality: a web tool for transforming MARC data. You can upload a set of records and it will transform them according to the ontology; you go to GitHub to address and correct issues, and the fixes are pushed back into the triplestore on the server, lowering the technical-expertise barrier. A very basic UI is set up; you can run SPARQL queries.
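For flavor, a hypothetical query against such a triplestore (the class and predicate names are illustrative guesses, not the project's actual ontology):

```sparql
PREFIX dc: <http://purl.org/dc/terms/>

# Find theses published by a given university, with their titles
SELECT ?thesis ?title WHERE {
  ?thesis a <http://example.org/Thesis> ;
          dc:title ?title ;
          dc:publisher "Example University" .
}
LIMIT 10
```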

Dr. Shiv Nagaragan: "A thesis is a write-many, read-once document." There is long-term value in theses created in Canada: an exponential increase in theses and dissertations over the past 20 years, and a lot of value locked in them in terms of spend per student, not to mention intellectual value.

Notes on data quality from MARC records: lots of "creativity" in the MARC records, a concern because no one has looked at them since creation, and we see bit rot even at the data level. Qualifiers like "electronic thesis" added to the title, or a supervisor note made in the author field, make these unretrievable and increase the cost of retrieving a record; no one has the computational time to clean up this data, so STOP IT. Records differ in level of richness and detail, and MARC records contain different data than repository records.

Next steps are to enhance the UI, find a long-term home for the project, and develop better data-quality checking: "Zamboni" processes to rebuild records with missing info like page counts, etc.

Every library should have its own SPARQL server to maintain theses. If you want to clone it and stand it up at your own university, contact Rob; he will help.

A perfect point to add ORCID iDs, but would it require permission from the individual? UBC looked for ORCID iDs and included them when found, but ran into challenges with the amount of effort required. Repository records are often richer in metadata, so those are often used instead of the MARC records.

Session 3: User Experience from a Technical Services Point of View (UX for Technical Services) - Shelley Gullickson and Emma Cross

UX of students doing academic research online, then looked at that UX from a tech services perspective.

Why? TS and UX don't generally go hand in hand, and TS finds it difficult to prioritize staff work: what work would have the biggest impact on users? What makes a difference to our students? Student-led, exploratory: where students search for info, how they search, and how they deal with search results in terms of what they look at. Students searched for something they needed for a real research project (not one we specified) and were asked to do what they would normally do, even though it's weird to be watched; they were also explicitly told they shouldn't feel they had to use library resources if they normally wouldn't. Think-aloud while searching, with prompting questions when they went quiet. Sessions ran to a logical end, 15-45 minutes, most 20-30. Emma took notes and observed, but they also used video capture, which was important because students worked fast. Coded for key themes:

1. Overwhelming use of a single search box: Summon or Google Scholar, not so much the catalog. Not much difference between undergrads and grad students (except that only undergrads used the catalog). Don't extrapolate; this is a qualitative study looking for patterns.

2. Popularity of the Get It button and the link resolver in general. Students were very verbal about liking this, and the link resolver presented no problems for them, though research from other university libraries shows students do stumble over it.

3. Metadata: looking at vs. searching for. They examined titles, publication dates, and abstracts. Records without an abstract or snippet confused them and got skipped (monograph records). Students look at these fields but don't search them: all keywords, all the time.

4. Fast and easy access: they generally don't go past the first page, or the first 10 results in Summon (undergrads often didn't get through 10). Many students were open about being busy and unable to waste time; they have to get through things fast. A tendency to skip anything slow: reserves, storage, checked-out books, documents taking too long to download, all skipped. Even when they did pursue these, they weren't happy about it. But usually they would find things they felt were just as good but easier to access.

No real big surprises. But how can Technical Services staff react to these results? One response: teach students to search the catalog, give them a booklet on searching the catalog. The head and supervisor reacted proactively instead. Summon and the link resolver: call out vendors when there's a consistent problem; we can be pushy because we know this is what students rely on. Loading ebook records: no keyword fields, but searchable summaries will surface in keyword search. Take more time to see where Summon pulls its information from and how we can improve that information. A big change from giving students a booklet.

A great fit to combine UX and Technical Services: TS makes the decisions about how our stuff is found online.

Q & A: Were students asked if they'd had library instruction? No. Sad-panda story about bad library instruction (searching keywords as subjects). We need to take care of the knowledge base because it degrades as a tool; do we put more resources into this? Do we need more research on the effectiveness of link resolvers?

Thursday, September 28, 2017

Access 2017 Conference Day 1 Notes Sessions 8-10 #accessYXE

Session 8: The SIMSSA Project: Search as Access to Digital Music Libraries - Emily Hopkins

Notated music: images of scores, made machine-readable and searchable. Single Interface for Music Score Searching and Analysis. SSHRC Partnership Grant, 2014-2021, with many international partners.

How it works:
Library Digitizes score
Optical Music Recognition
Music encoding initiative
Music search and analysis

1. How do we access scores? Each library has its own scores.

IIIF gathers them in one place: over 67,000 documents with more on the way. But they're just pictures.

2. How do we teach computers to read musical scores? Optical music recognition (OMR) makes scores machine-readable in an XML analog for music (MP4, or MEI, based on TEI, which lends itself well to library collections). Commercial options like Sibelius's PhotoScore are designed for standard music scores (not handwritten ones). Many researchers study chant (e.g. the Salzinnes Antiphonal) and other less common forms of notation. A page or two from a document is processed, identifying ground truths to prep for machine learning. Pixel.js allows correction of the machine-read ground truths. Pixelwise classification: the computer determines the probability of each pixel belonging to a particular category; it does not depend on domain-specific heuristics, but on document-specific ground truths. The Interactive Classifier trains the OMR; Neon.js overlays the end result on the original image, and then you can create an MEI file faithful to the original manuscript. Undergoing usability testing at McGill and Dalhousie.
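A toy illustration of pixelwise classification as described above: each pixel has a probability per category (the categories and numbers here are invented for illustration), and the pixel's label is simply the most probable category.

```python
# Hypothetical pixel categories for a chant manuscript page
CLASSES = ["background", "staff_line", "note"]

def classify_pixel(probs):
    """Given one probability per class for a pixel, return the argmax label."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASSES[best]

# This pixel is most likely part of a note
print(classify_pixel([0.1, 0.2, 0.7]))
```

A real system produces these probability maps with a model trained on the document-specific ground truths, then runs the argmax over every pixel of the page image.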

Crowdsourced OMR correction: Rodan organizes the workflow, in collaboration with partner organizations and user communities. Now we have a format a computer can read; how will search and analysis work?

3. Existing search allows metadata search. OMR allows melodic search, including variations and degrees of variation from the original melody. Instead of laboriously studying melodic lines by hand, researchers can use large-scale searching. See Schubert and Cumming's 2015 article in Early Music 43.4, pp. 577-86. Studying patterns can reveal the styles of different eras and composers. See also Desmond, Hopkins, and Howes's 2016 SIMSSA VIII presentation on the treatment of different kinds of chords and how it changed over time.

Complete change in scope of research questions that can now be asked. (This is amazing.)

Session 9: Supporting Media-intensive Digital Scholarship: The Development of a Streaming Media Repository at the University of Alberta Libraries - Sean Luyk, Umar Qasim, & Weiwei Shi

ERA A+V is a media hosting and streaming service based on the Avalon Media System, an open source media repository project (Fedora), alongside the IR and Dataverse. Many alternatives are not true streaming but progressive download. Kaltura, Ensemble, and ShareStream focus on short-term access for teaching and learning, with limited metadata and no control over deposit; they're more like an individual account, like a YouTube channel, where you can't control the copyright process.

Project kickoff October 2015 to launch Avalon and the A+V service. Soft launch for library staff in May 2016, public beta September 2016 to June 2017, official release July 2017. An upgrade to Avalon 6 would bring Fedora 4; this will expand to self-deposit.

Model: deposit is mediated for now (a Digital Media Deposit Request Form open to faculty and grad students) due to the size of files and concerns about copyright and rights investigation. Deposits are usually done by librarians in batches. The service is described on its own page; the single point of contact is the IR helpdesk.

Technical Overview
The repository is adapted from the Avalon Media System, developed as a Samvera head, i.e. a Rails app. A small component transcodes multimedia content; upon ingest it creates three derivative versions for different streaming needs, which Wowza then serves. (Need a better look at the Architecture Details slide, too tiny.) Core features of the Avalon Media System: deposit, workflow management, search, discovery, single sign-on (CCID), LTI integration with the LMS, fine-grained access control at item and collection level, embedding content in websites, playlists, closed captioning, standards-based metadata (MODS), persistent URLs.

Customization: batch ingest streamlined even further, with a UI-initiated trigger to start the ingest process. Customization of the metadata/deposit form, additional validation on media objects (language and genre), branding consistent with the IR, minor modifications to clarify permissions and access control. They follow test-driven development with continuous integration tools integrated into the review workflow; it's a challenge to enforce practices when dealing with inherited code of varying quality and style. Robust testing; added Rollbar log monitoring.

See - driven by Blacklight search interface

Currently at around 170 publicly published objects; the main use is online education, plus some research data management case studies. One course had been sharing via Google Drive, with students using their non-UofA accounts and constantly requesting access; another course in LIS provides streaming access to some content. Challenge: originally they were going to stream screen-captured DVDs, but legal warned them off because of the TPMs on DVDs. Challenges with LTI integration, which required a lot of development to integrate with the Moodle instance. Research data management right now is really just making collections available for research: recordings of UofA concerts, fieldwork collections of Sufi music recordings. Long-term potential for integration with other repositories.

Media-intensive scholarship: investigating the service as a hub for media-oriented DH projects, though not really manipulating or "reading" media files yet. OJS journals, licensed resources.

Q: You're also running a Sufia head. Do you think there will be good media handling, or will you continue running two separate heads? A: Goal is eventually one experience. [Ah, really works as a separate repository.] They are stepping away from Samvera as the IR for the main repository to explore other options.

Session 10: The Academic Library Commons: Re-imagining Collaborative Learning Spaces to Support the Scholarly Journey from Curiosity to Discovery to Publication - Michael Courtney and Angela Courtney

198-year-old library system has evolved to take into consideration the evolution of technology in the library's role on a university campus. Indiana University (Bloomington) Libraries: 12th largest in North America, 20+ libraries across campus, >9 million print volumes, 900+ languages. Over 180 years, intense focus on building the research collection. Focus has changed. Physical library has been in only a few locations--building as monument to the collection, a brutalist box. Once crossed into the 21st century, change. Change accelerated in the last 17 years. University President very IT-driven; made it a mission of his Presidency to not just increase and improve but magnify the technology infrastructure of campus.

In 2002 the first dramatic physical change in the main library occurred: the physical collection for the gen-ed undergrad curriculum was rebadged as an info commons; print collections were pushed upstairs, relocated, or pulled out for offsite storage. This broadened the space, making room for a lot of technology (workstations, some internal IT support); then that stayed stagnant as an IC for about a decade. After that, again re-envisioned. In one way infused with more tech; in another, a lot of tech was removed. In 2010, real observation of the ways students used spaces - not using a lot of the workstations (hundreds), so 75% were removed. Instead, University IT became a 50% partner in the space: 24/7, staffed by University IT consultants, and includes a rich service hub--tech support and support across the student research life cycle: research consultants, librarians, library assistants, a peer assistant program, student-life-issue peer consultants, a writing tutorial services physical unit in that space. Technology-driven partnerships.

More commons-ing: the Grad Commons--less of a focus on tech and more on space exclusively for grad students to study and do research. Partnership with grad student and professional organizations because they wanted and needed dedicated space. Survey, assessment, reflection, pause, repeat. Grad students wanted certain aspects of tech - emphasis on outlets, a printer - but above all a quiet, dedicated, grad-student-only space. IU Scholars' Commons. IQ Wall - hi-def visualization wall.

Programming partnerships with office hours in library.


Access 2017 Conference Day 1 #accessYXE Notes Sessions 4-7

Session 4: Excavating the 80s: Strategies for Restoring Digital Artifacts From the First Era of Personal Computing - John Durno

"Avoiding technological quicksand" Rothenberg presentation review: hard copy, standards, computer museums, format migration, emulation. Ultimately he argued all but emulation would be of limited utility.

David Bearman's "Reality and Chimeras in the Preservation of Electronic Records" in 1999 D-Lib. OAIS Functional Model. Rothenberg's emulation still at work - see Internet Archive, see code4lib paper (missed cite), Preserving and Emulating Digital Art Objects (Cornell white paper). Usually case-based choosing whatever works, not prescripted.

Case Study 1: AtariWriter - no modern software can read or convert it. But if you can play games on an old platform, someone has probably written an emulator for it. Locate, install, configure an open source emulator. Tracking down old software is not usually that difficult, though legality is uncertain. Copies of abandonware for purposes of retrieval and study may fall under fair use... maybe. Reality is that in most cases you don't want to read the old documents in that form, reconfiguring every time. Easy to configure the emulator to bridge printing output to the host operating system, so could print to PDFs. Even if you think format migration is the best way to go, sometimes the only way to get there is via emulation.

Case 2: WordStar. Opening files one by one and converting them is tedious; batch converters for WordStar are scarce. Can fall back on standards - 7-bit ASCII. WordStar set the high (8th) bit on the last character of each word, which makes the last letter of each word unreadable as plain ASCII. Perl can run a batch job to strip the high bit. Lose formatting converting to ASCII, but most old files didn't have much formatting, especially correspondence and personal papers.
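The high-bit problem lends itself to a one-line batch fix. The talk mentioned Perl; here is a minimal Ruby sketch of the same idea (the function name and sample bytes are my own, not from the talk):

```ruby
# WordStar sets the high (8th) bit on the last character of each word.
# Clearing bit 7 on every byte recovers plain 7-bit ASCII text.
# (This also flattens WordStar's control bytes, i.e. formatting is lost,
# which matches the trade-off described in the notes.)
def wordstar_to_ascii(bytes)
  bytes.each_byte.map { |b| (b & 0x7F).chr }.join
end

# "Dea" followed by 'r' with its high bit set (0xF2) decodes back to "Dear".
puts wordstar_to_ascii("Dea\xF2".b)  # => Dear
```

Run over a directory of files, this is the whole batch conversion: read bytes, mask with 0x7F, write out.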

Case 3: Telidon 699. In 2015, gave a presentation on retrieving images from this format. Initial success with an emulation-based approach, but a few hundred more files didn't work that well. Problem files had been developed in an earlier version. Consumer-grade computers couldn't render them natively, so specialized peripherals were developed solely to render Telidon images. Given the ultimate failure of Telidon to gain traction, these are almost impossible to track down, but almost the only way to make the files intelligible short of writing a decoder from scratch. Luck: located a functional Telidon 699 decoder at the Spark Radio Museum outside of Vancouver, and a technician. You need a moving image format to capture the unique qualities. How to transfer images across generations of technology? Recorded off a CRT using a video camera (similar to the way the original folks did it). Glenn E. Howarth image of Moon Meat: Vegetarian Nightmare.

Even if we agree hardware preservation is hard, we can't get away from it: if someone brings you old floppies, reality is that there aren't modern devices to hook up to SCSI cords, and drivers haven't been updated since Windows 98. The debate between emulation and migration is a red herring. Different classes of material require different handling. Requirements of practice can confound and complicate requirements of theory.

Session 5: Opening the DAMS: Open Systems, Open Data, and Open Collaboration with Samvera at UVic - Dean Seeman and Lisa Goddard

Samvera (used to be called Hydra), a toolset that sits on top of Fedora (Fedora 4). (Samvera is a pathway to Fedora.) Samvera is a complex stack of open source components. Simplest way of installing is Hyku (Hydra-in-a-Box - the closest to a turnkey pregenerated application to install in a sandbox). UVic branded Hyku as Vault; not in production but starting to populate. CONTENTdm was the workhorse but doesn't align well with strategic objectives, which include the ability to roll it out to faculty to use for their research projects. Example: grants - the in-kind value faculty can add to grant applications. Have to have a web client currently, which is a real obstacle to broad collaboration. Need a system that supports multitenancy--faculty won't just want the library asset system, but their own system, display, permissions, etc. Need a tool for working with images (IIIF supported by Samvera--especially important for annotating images). Using the Spotlight exhibit platform. Objects in the DAMS need to be pulled into exhibits where you can do narratives and content. Both use Ruby and similar architecture.

Wanted more control over development of the system: roll out features when needed and not wait for long development cycles. Help faculty, but use their funding to fund new features instead of leaving small siloed pieces of software all over campus with no benefit over and above the initial project. Need a data store for all emerging digital activities. Optimized for academic libraries. Built-in versioning, automated checksums - can run audits on those to ensure files haven't degenerated in nonobvious ways. Globally unique identifiers make audits easier. Globally web-addressable.
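The checksum audit idea - catch files that have silently degenerated - reduces to comparing a stored checksum against a freshly computed one. A stdlib-only Ruby sketch (the manifest-at-ingest step is my stand-in, not Fedora's actual fixity service):

```ruby
require 'digest'
require 'tempfile'

# Compare a file's current SHA-256 against the checksum recorded at ingest.
# A mismatch means the bytes have changed in a nonobvious way (bit rot,
# truncation, tampering) even though the file still opens.
def fixity_ok?(path, recorded_sha256)
  Digest::SHA256.file(path).hexdigest == recorded_sha256
end

file = Tempfile.new('object')
file.write('master TIFF bytes')
file.close
recorded = Digest::SHA256.file(file.path).hexdigest  # stored at ingest time

puts fixity_ok?(file.path, recorded)   # => true
File.write(file.path, 'corrupted bytes')
puts fixity_ok?(file.path, recorded)   # => false
```

An audit job is then just a loop over (path, recorded checksum) pairs, flagging any false result.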

Fedora 4 is based on the LDP specification (Linked Data Platform); exposes standard, well-documented APIs for create-read-update-delete requests. Portland Common Data Model supported. A linked open data architecture they hope will result in data that is web-native, taking advantage of web protocols. Systems are not permanent, but we hope our metadata are. Can plug an external triplestore into Fedora - can see our data in other contexts, see how it links or should link to other datasets.
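Those create-read-update-delete requests are plain HTTP against Fedora's LDP-style REST endpoint. A hedged sketch (paths are illustrative; the `/fcrepo/rest` base is a common default but deployments vary):

```
# Create a resource inside a container (Fedora mints or accepts the URI)
POST /fcrepo/rest/collection HTTP/1.1

# Read a resource's RDF description
GET /fcrepo/rest/collection/object1 HTTP/1.1
Accept: text/turtle

# Update properties with SPARQL-Update
PATCH /fcrepo/rest/collection/object1 HTTP/1.1
Content-Type: application/sparql-update

# Delete the resource
DELETE /fcrepo/rest/collection/object1 HTTP/1.1
```

Because every resource has a web-addressable URI, the same URIs that serve the repository can be referenced from a triplestore or any other linked data context.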

Advantages from a cataloging perspective: Why change systems? CONTENTdm for 8 years; there's got to be a better way. Gives a chance to rethink metadata, get feet wet with linked data practically, and agency.

Rethink metadata: analyzed CONTENTdm: 52 collections, 162 unique local fields, 89 of those fields used only once across the entire system - inconsistent. Need an application profile; borrowed from Europeana, etc. Know the metadata you have and how you have used it in the past; make informed decisions about designing the profile. Beyond properties, let's talk values and consistency - controlled vocabularies, RDF and identifiers with URIs to make linked data happen. Going for interoperability, giving up some specificity for that. A practical apology for linked data: global identifiers > local.

Agency: often we don't get a say in how systems are developed. What principles can we enact? We need people and machines to migrate and create good data. Humans with varying technical skill need to help create good data.

Machine Data Creation -> Human Judgment Required -> Machine Consumption + Human Consumption

How to enact? Field/property mapping from CONTENTdm fields to Samvera fields. Content migration mapping (values) - used "Vaultify," which tries to assign values and syntax from controlled vocabularies automatically, as much as possible. Fine for migration, but how to marry humans and machines in terms of metadata completion? Autocomplete, retrieve controlled vocab, retrieve URIs, help standardize syntax and other content when possible. See the IMLS grant initiative with the University of Houston.

Q&A: researchers happy to have help because not in it to write software and can focus on specific research question, as well as provide in-kinds, and provides opportunity to talk about digital preservation anyway.

Personal note: yes, get on this. Relevant to what we want to do at CSUCI.

Q&A - provenance issue: do these things become part of the Library's preservation and intellectual-preservation responsibility? Onboarding to the DAMS is preferred because the library can have input into metadata and technology. At least if they can work with faculty on metadata models from the beginning, they get material that's more easily preservable. Yes, UVicLib will take responsibility for those digital materials; whether they'll be indexed remains to be seen.

Session 6: The Way Leads to PushMi-Pullyu, a Lightweight Approach to Managing Content Flow for Repository Preservation at UofA Libraries - Weiwei Shi, Shane Murnaghan, & Matt Barnett

Pushmi-Pullyu from Dr. Doolittle. A Ruby app running behind the firewall that pulls content from the Fedora repository in response to user action, constructs a lightweight archival information package, and pushes it into OpenStack Swift.

Stack: ERA (InstRepos) - Ruby on Rails currently based on Sufia 6.2 (a Samvera Head)
Fedora 4 repository: open source system for management and [...]

OpenStack Swift for long-term preservation storage. Highly available, distributed, and consistent object store; versioning, internal audit, quarantine, etc. to ensure the integrity of preserved objects.

Preservation commitment/preservation plan - gold, silver, bronze. Need lightweight tools for the baseline requirements of the commitment.

AIP (Archival Information Package) - consists of content info and preservation description (content, metadata, packaging, and one or more files to capture a comprehensive image of the object). Need a lightweight AIP without much investment while still figuring out different targets, but still meeting the baseline requirement. Lightweight AIP diagram: content of objects, metadata, thumbnails, and logs contained in a bag (in a tar file) with manifest, in a Swift object with name-value pairs: project, project ID...
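The manifest inside such a bag is just checksum-plus-path lines, which the Ruby standard library can produce directly. A minimal BagIt-style sketch (directory layout and file names here are hypothetical, not UofA's actual AIP structure):

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Write a minimal BagIt-style manifest: one "<md5>  <relative path>" line
# per payload file under data/, so the bag can be validated after it is
# tarred and pushed into Swift.
def write_manifest(bag_dir)
  data_dir = File.join(bag_dir, 'data')
  lines = Dir.glob(File.join(data_dir, '**', '*'))
             .select { |f| File.file?(f) }
             .sort
             .map { |f| "#{Digest::MD5.file(f).hexdigest}  data/#{f.delete_prefix(data_dir + '/')}" }
  File.write(File.join(bag_dir, 'manifest-md5.txt'), lines.join("\n") + "\n")
end

Dir.mktmpdir do |bag|
  FileUtils.mkdir_p(File.join(bag, 'data'))
  File.write(File.join(bag, 'data', 'object.txt'), 'content bytes')
  write_manifest(bag)
  puts File.read(File.join(bag, 'manifest-md5.txt'))
end
```

Tarring the bag directory and setting the Swift object's name-value metadata (project, project ID, ...) would complete the lightweight AIP.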

PushmiPullyu exists because they want future integration with Archivematica, need to fulfill today's preservation commitment (maybe not capture more than essential info), and because of IT security restrictions: direct access from the public-facing network of the repository to preservation storage poses a risk to the preserved content - someone could hack through the public side to attack preserved files.

PMPY Development: goal was to create simplest thing that would work.
YAGNI development (You Aren't Gonna Need It) - no complex logging, no complex reporting, no complex web app to check status, as little custom development as possible
Fedora provides a messaging queue implemented with Java Message Service; PMPY could maybe trigger the preservation event from it. But it creates noise - Fedora's JMS sounds great, but with cascades of saves and updates, and background jobs creating characterization, derivatives, DOIs, etc., almost 70 messages are created for a single ingest... which message do you want to preserve your item on? If you go up to the Rails layer, it provides callbacks - anything that happens in the model can trigger code - so use Rails "after_save" callbacks on the Item model.
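The callback approach can't be shown runnable without a full Rails app, but the idea reduces to a single hook point on the model. This pure-Ruby sketch stands in for ActiveRecord's `after_save` (class, id, and queue names are hypothetical):

```ruby
# Stand-in for an ActiveRecord model: every successful save fires exactly
# one hook, and that hook is where PMPY-style code enqueues the item for
# preservation - one trigger point instead of ~70 JMS messages.
class Item
  attr_reader :id

  def initialize(id, queue)
    @id = id
    @queue = queue
  end

  def save!
    # ... persist to Fedora via the Samvera stack (elided) ...
    after_save
  end

  private

  # In Rails this would be declared as `after_save :queue_for_preservation`.
  def after_save
    @queue.push(id)
  end
end

queue = []
item = Item.new('era-0001', queue)
item.save!
item.save!               # repeated saves enqueue repeatedly -> needs dedup
puts queue.inspect       # => ["era-0001", "era-0001"]
```

The duplicate entries at the end are exactly the "items saved multiple times during ingest" problem the team then solved with a deduplicating priority queue.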

New problem: items saved multiple times during ingest. Decided they really needed priority queues - least recently updated has highest priority for preservation. Many complex queueing solutions support this, like RabbitMQ, more (missed one)

Redis sets - guarantee item appears once and only once no matter how many times added. Redis sorted sets (more, ask for slides).
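The sorted-set trick can be illustrated without a Redis server: score each id by its last-updated time, and re-adding an id only updates its score, so each id appears exactly once and the least-recently-touched item pops first. A pure-Ruby sketch of the semantics (not actual Redis calls; class and item names are mine):

```ruby
# Minimal stand-in for Redis ZADD / ZPOPMIN semantics: a Hash keyed by
# item id, valued by a score (e.g. last-updated timestamp). Re-adding an
# existing id only updates its score, so each id appears exactly once.
class PreservationQueue
  def initialize
    @scores = {}
  end

  def add(id, score)
    @scores[id] = score                    # ZADD: insert or update score
  end

  def pop_oldest
    return nil if @scores.empty?
    id, _ = @scores.min_by { |_, s| s }    # ZPOPMIN: lowest score first
    @scores.delete(id)
    id
  end

  def size
    @scores.size
  end
end

q = PreservationQueue.new
q.add('item-a', 100)
q.add('item-b', 50)
q.add('item-a', 120)   # saved again: score updated, still only one entry
puts q.size            # => 2
puts q.pop_oldest      # => item-b
```

With real Redis the same shape falls out of `ZADD queue <timestamp> <id>` plus popping the lowest-scored member.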

PMPY development - system design: IR pushes into the priority queue, PMPY grabs from it, pushes into Swift. Software development: used GitHub with the ZenHub extension for project planning and management. Agile - sprint planning, sprint backlogs, sprint demos, standups, retrospectives; 4 weeks, 4 full-time developers. Continuous integration, and style consistency (Hound, RuboCop, EditorConfig help check and maintain consistent style for every commit). Working in Ruby you can leverage other work; wrote less custom code and more tests. Log monitoring using a paid service called Rollbar, useful for monitoring apps.

Next: Gemify AIP creation and Swift ingestion; extend to file-system based ingestion, extend components to other platforms requiring preservation in the Library.

Session 7: "No, We Can't Just Script It" - Danielle Robichaud & Sara Allen
Migrating archival data. Archival data, archival description history, a case study at the UofWaterloo. They do script a lot of data transformations, not always manually copying, but wanted to focus on the factors making that work difficult. Archives ≠ libraries; archival data ≠ library data. All descriptions are original work - it's the first time the material has been described within the archival environment. No copy cataloging; need to spend time researching collections, reading about them, writing about them; archivists get emotionally involved and protective of the data. Records are organic and interrelated. No 1:1 relationship between a descriptive record and the object itself (a record can be of 1 thing in a folder of things, a box of things, etc.). Don't usually describe individual photos because they usually don't have a mandate to do that. Describe some level of hierarchy, depend on researchers to drill down and find individual items. Only describe individual items with funding from a donor, or decide arbitrarily what is important, or for legal reasons, etc.

Complex data, often fragile, not easy to replicate, stored somewhere on an archivist's hard drive. Cataloging has been working with shared standards for decades; archives' standards adoption has been slow and uneven. Internal systems developed sometimes collection-specific, sometimes institution-specific. 1990: Canadian Rules for Archival Description (RAD) released. Mid-90s: database solutions to replace paper records. 1995: HYPER-RAD, a hyperlinked, easier-to-use version of RAD, released. Late 90s-00s: better databases/specialized archival management software. 1998: EAD brings XML to archives - first time archival data is machine readable. 2001: ICA OSARIS report recommended a standardized open source tool for encoding archival finding aids. July 2008: ICA-AtoM 1.0 beta released; Nov 2008: ICA-AtoM 1.0.4-beta now has support for RAD. Atherton quote in "Automation and the Dignity of the Archivist": "Just to mention the words 'computer' or 'automation' in some circles is to invite cold suspicious stares of hostility, making one feel as though he has said something dirty."

Photographic negatives collection - 2 million, one of the most heavily used. Recent pilot of Islandora - Waterloo Digital Library. Illustrates the challenges, especially when it comes to the available descriptions to work with. Info provided by photographers was not created thinking about helping researchers at the other end. Title, date shot; no indication of if and when it appeared in the paper, or color vs. B&W. Complete disconnect between the archival description available and what researchers are expecting to find. When thinking about migrating digitized images, the obvious entry is to duplicate the file-level record--doesn't work. Confusing, and suggests staff screwed up when the title doesn't jibe with the contents. How to describe item-level images, how to introduce keywords meaningful to staff and end users: designations of whether it ran in the paper or not, a brief description, clearly identifying that the info was lifted from the newspaper, not original work by the library. Scripting not helpful here in this instance. But batch XML file creation is helpful. Archivists are often expected to understand libraries; a history of being undervalued and having expertise questioned may make archivists cagey. Listen to archivists' asks for help instead of assuming they haven't heard of OCR.
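Batch XML file creation of the kind mentioned can be done in plain Ruby. This sketch turns rows of photographer-supplied data into simple item-level records (the element names are hypothetical illustrations, not an actual MODS or Islandora schema):

```ruby
require 'tmpdir'

# Escape XML special characters so titles containing & or < stay well-formed.
def xml_escape(text)
  text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
      .gsub('"', '&quot;').gsub("'", '&apos;')
end

# One record per negative: title, shoot date, and whether it ran in the paper.
def record_xml(row)
  <<~XML
    <record>
      <title>#{xml_escape(row[:title])}</title>
      <dateShot>#{row[:date]}</dateShot>
      <ranInPaper>#{row[:ran]}</ranInPaper>
    </record>
  XML
end

rows = [
  { title: 'Mayor & council, city hall', date: '1967-05-02', ran: true },
  { title: 'Harvest, unidentified farm', date: '1967-05-03', ran: false }
]

# Batch step: one XML file per row, numbered sequentially.
Dir.mktmpdir do |dir|
  rows.each_with_index do |row, i|
    File.write(File.join(dir, format('negative_%04d.xml', i + 1)), record_xml(row))
  end
end
puts record_xml(rows.first)
```

The scripting that *is* helpful here is exactly this mechanical step; the intellectual work of writing the descriptions and keywords stays with the archivist.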

Access 2017 Conference Day 1 #accessYXE Notes Sessions 1-3

Dean Melissa L. Just, University of Saskatchewan - Opening Remarks (My Notes)
Hot topics from 19 years ago since last in Saskatchewan: classifying the web for search engine Northern Light; planning for sustainable desktop computing; Z39.50. Here "we come together to discuss current and cutting edge challenges and opportunities."

The University library worked in collaboration with others on such projects as Saskatchewan History Online (over 100,000); the Indigenous Studies Portal, or iPortal, with a number of archival entities (more than 33,000 full text resources focusing on First Nations and indigenous peoples); and the Our Legacy project, an extensive collaboration with various archives, libraries, and historical centers and societies. All made possible from small ideas germinated at conferences like this one.

Keynote - Dr. Kimberly Christen "The Trouble with Access"
Director of All the Things (so many my typing fingers couldn't keep up). Cultural heritage and digital technology in support of digital repatriation.

Ethics of displaying Native American peoples' information online. Her Access 2011 talk focused on the tension between the idea that openness is a default, versus the diverse sets of relationships to knowledge and sharing that indigenous communities bring to the understanding of information sharing. She emphasized the limits of ideas of openness not only in digital collections, but in how researchers imagine and construct an often uncritical understanding of openness as a spectrum from open to closed, instead of nodes in a vast network of types of circulation. Today's talk runs parallel, connecting indigenous sovereignty, ongoing decolonization, and knowledge and relationships of our materials online and in physical form. Opening collections closes relationships; we claim neutrality, but build within power structures; build structures meant to democratize but that colonize. Blind invocation of universal access, and celebration of open access in particular, all too often aligns open access with decolonization.

Colonial dispossession in ongoing colonial states: the removal of indigenous bodies from indigenous lands. Respecting and providing physical access to land should acknowledge that we can't separate access to land and access to knowledge.

What does it mean for libraries, archives, and museums as institutions, and those of us in them, to provide a space for indigenous peoples free from fear and with respect? What does it mean to be part of a chain from the future into the past, where the land is part of these knowledge systems? We can't separate access via digital technologies from the systems of knowing upon which they are built. Proposal: we should trouble access, and disavow political inaction. Refuse to ignore the elephant in the room; undo structures that maintain active unseeing/unknowing. Sustain tearing down, building anew. We need to grapple with relationships and histories, engage in respectful exchange of knowledge, acknowledge our work in the erasing of such.

Her work since 2011: the Plateau Peoples' Web Portal. Relationships, structures, and policies moved this from a project (one-time) to a practice (an ongoing commitment to engagement with tribal peoples across the university). Began with an MOU between the University and the indigenous peoples upon whose land the University is built. Built on recognition of indigenous sovereignty. The Center for Digital Scholarship and Curation listened to the Native American advisory board when they noted they wanted a multi-tribal portal; it had to be online so many could access it; important to include many types of Native knowledge in the content, not just that deemed 'scholarly.' Individual items are important, not the high-level cataloging. Tribal decision-making over content that would be posted, and how it would be accessed. Managed in a government-to-government relationship, access according to native protocols.

Tribal administrators chosen by internal governing principles curate content and decide on sharing knowledges. Lists of cultural protocols for access--the content here is specifically public, purposely made public; members are added to protocols based on their relationships. Collaborative communication: no digitization or display of any materials without approval. "Reliance on takedown notices replays the cultural violences of physical dispossession of materials. Once content is online and circulating, the damage is done." 12 categories of knowledge chosen and updated as needed, individual materials added by each tribe, so no default to subject headings (LoC) because they have been violent. 2015 Cataloging and Classification Quarterly has good articles on this.

This process of collaborative curation, call digital return (not just digitizing and returning so we hold copies, but embedding indigenous protocol for description, use, and access at all levels - MOUs, design decisions, categorization, digitization, etc.).

Museum refusal to return Salmon Chief's knife; offer to send high resolution image. There are very real limits for digital return, there are time that physical materials must be held, touched, used, passed on.

Museum record, institutional metadata standard, versus community curation adding own record with additional metadata in another tab on same page--community record has a new title not 'root gathering basket' but 'duck basket', and defines attribution not to non-native collector but to tribe, and narrations by named individuals and members of community. Because allows any file format, additional layers of knowledge. Each record can have its own protocol for access. Narrations bring photos to life, asked to sit with the unease of colonial practices.

It means we also need to provide tools to allow this work to happen elsewhere. Free and open source, built with the idea of ethical ways to record info in the face of sensitivity: Mukurtu allows for multiple layers of community information and attribution (multiple records for any file, multiple layers of attribution), customizable to the size and needs of a community, and also meant to be viable software for non-indigenous communities. No one profile except the common ground of growing from local indigenous needs; Mukurtu focuses on growing relationships, reciprocity, intergenerational knowledge sharing and exchange. Mukurtu literally means dilly-bag: sacred items kept in a bag by elders; youngsters needed to ask permission before accessing them. Groundedness in people, place, and ancestors. Based in local conceptions of access. But it is a tool, a platform.

Mukurtu - 3 Cs. Communities: the who of Mukurtu - who are the stakeholders and contributors; nobody goes unnamed. Cultural protocols: the how - how are content and metadata shared, what protocols are there for sharing/access? Granular. Categories: the what of Mukurtu - what is most important to the community? These elements are core architecture. Encourages these connections. Much like fair trade labels, TK labels allow us to make ethical decisions about the use, reuse, and circulation of information. (Strike me as similar to CC licenses.) See the Sq'ewlets people, a tribe of the Sto:lo, whose website guides users via defining TK labels, and whose secret/sacred label replaces images of sacred sites and human remains. They provide information about how they understand the viewing of human remains, and an understanding of why they choose to circulate or not those images that may re-enact violence or violate sacred strictures.

The LoC updates a record with a rights and access field, adding the Passamaquoddy people's own language and description for the Passamaquoddy War Song.

Not bad colonial museums vs. good postcolonial ones; the intent is to illustrate the levels through and at which we need to be vigilant. The history of technological advances in cataloging, collecting, and display cannot be divorced from the violence of dispossession, the legacy of looking and taking. Not all info is intended to be digitized, accessed, and preserved. Material semiotics: always situated someplace, and not no place. (Quoting feminist scholar Donna Haraway.)

Foreground agreements with indigenous peoples. We can add steps to workflows that account for more voices, to combat not-seeing and seeing-through. We can work with communities. We can work to undo legal fictions of ownership by providing the alternative licenses and strategies, or ignoring the ones that exist.


The concern of "rewriting history" if we don't save and provide access to all information.
Notion of transgression, of openness, tries to erase other systems. Acting ethically doesn't mean we act the same in all situations. There is no level playing field (Leanne Simpson (sp?)). Acting ethically in terms of indigenous collections requires different perspective and tools and relationships, versus medical records from non-native Canadian hospital. Ethics is not a blanket we can put over everything. We are so afraid of legal structures because we are programmed to be. We are policing ourselves at the expense of undoing hegemonic practices.

License in perpetuity to maintain these materials. As state institutions it is our responsibility, it's expensive to maintain -- servers, long term work, none of us can predict what it will cost to store this. As universities we must make commitment, we pay tons for football stadiums. Structural level and getting things in writing is important. We need MOUs, workflows, "tip to tail."

Question: In the West, we are conditioned into controlled vocabularies. Answer: controlled vocabularies are intended to provide order. We are obsessed with order and organizing. The myth of organization. LIS schools are a problem! They perpetuate it with standing courses that teach standing to the side and being objective, rather than context and relationships.

To read: Donna Haraway's Staying with the Trouble: Making Kin in the Chthulucene

Session 1: Visualizing Province-Wide Public Library Transaction Data with Elastic Stack - Dale Storie & Scott Murray from SILS

Serves 11 agencies and over 300 branches in the province. Migrated to a new ILS (Polaris) in 2015, so reporting is important because libraries want to know where materials are going, who is requesting what. Data-driven decision making. Uses SQL reporting as the backbone; get reports as Excel spreadsheets - boring. Tech news sites were turning boring firewall logs into cool visualizations and dashboards - what about doing it with ILS reporting data?

Elastic Stack: 3 free open source components, licensed under Apache 2.0:
ElasticSearch - data storage and search index
Logstash - gathers data from sources and sends it on to destinations
Kibana - web app for searching and visualizing
(Used by AirBNB, Yelp, NYT, NYPL.)
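A pipeline like the one described - circ transactions out of the ILS's SQL database and into Elasticsearch - is wired up in a Logstash config. This is a hedged sketch with made-up connection details (host, user, table, and index names), not SILS's actual pipeline:

```
input {
  jdbc {
    # Hypothetical connection to a Polaris SQL Server reporting database
    # (driver library/class settings elided)
    jdbc_connection_string => "jdbc:sqlserver://polaris-db:1433;databaseName=Polaris"
    jdbc_user => "report_reader"
    statement => "SELECT * FROM circ_transactions WHERE tran_date > :sql_last_value"
    schedule => "*/15 * * * *"   # poll every 15 minutes
  }
}
filter {
  # Use the transaction time, not ingest time, as the event's @timestamp
  date { match => ["tran_date", "ISO8601"] }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "circ-%{+YYYY.MM}"   # time-based indices feed Kibana dashboards
  }
}
```

Once the events are in Elasticsearch, the Discover/Visualize/Dashboard steps below are all point-and-click in Kibana.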

Give permission only to staff who should have access through open directory, same login as use for usual work.

Circ transactions in Polaris

1. Discover
- search the dataset
- preview results

options for search (full text, exact value, Lucene query syntax)

2. Visualize
- standard visualization options (charts, heat maps, stacked area graph, awesome mapping facilities, etc)

3. Dashboard
- multiple visualizations brought together in a dashboard
Dashboards vs traditional reporting - easier for nontechnical staff to grasp implications of data and trends

Also used as an alerting system, monitors API use
Code at

Session 2: Advocating for Digital Privacy The Centrality of Public Libraries in a Uniquely 21st Century Struggle - Jonathon Hodge 

Important to ask why we are doing something. Why? Why advocate? Why care about privacy?

Thomas Paine's Common Sense is the most circulated book in American history, published anonymously because he would have been thrown in jail. More recently, Thelma and her personal history outed by AOL searches; AOL generated a machine number but assigned the same number to the same person every time. Imagine that information in the hands of an insurance provider, a police officer, a hirer... Ed, a software and security consultant, had a crisis of conscience about aggregated data (Snowden). Today, any one of our community members could be one of those people.

Without technological protections for privacy, the actions of these forebears and contemporaries would not have had the impact they did on our society. Who will stand up, if not us? ACLU, other nonprofits. Privacy is complicated, but so is the tax system. People can navigate that because of layers of intermediaries (lawyers, accountants). We need infomediaries - libraries and librarians can be those.

Tor Browser - digital privacy initiative - can we put together a plan that is user-focused?

Plans to make it available to other libraries and librarians? Yes, working with a group out of the University of Montreal for Cybersecurity 101, Michael Joyce. The intention was a public file drop. Email for PDFs and Word docs, and they'll be building a portal, hoped to be live before Halloween. The response of the community has been almost universally positive. The rights that we enjoyed in the analog age are things we have to take active steps for in the digital age. Confidence and competence in this are core to the public library mandate.

Session #3: Don't Be So Sensitive - A Data Security Journey - Hannah Rainey and Emily Lynema

Context: Hannah is a Fellow at NCSU (my old stomping ground!), focus on data literacy at land-grant institution. Effective Stewardship of Library Data proposed as project by Emily Lynema

Data and cybersecurity are of primary concern to many institutions - businesses, governments, institutions of higher ed, individuals - with new data breaches every week. Campus prioritized cybersecurity - legal and regulatory landscape. Federal and state statutes: FERPA and HIPAA are American examples. At NCSU, regulations are interpreted through a Data Sensitivity Framework that assigns sensitivity levels to data: purple, red, yellow, green. Scaffolding for the project. The central Office of IT did much of the initial legwork. Privacy ≠ security. We used to purge records; now we have room logs, system logs, data collection beyond circulation - now in the face of big data and learning analytics to prove worth to greater institutions.

1. What data do we have? 2. Where is that data? Internal data audit, assessing practices based on the institutional framework. Difficulty finding practical approaches to protecting patron data. Helpful info in the DAF (Data Audit Framework), which provides organizations with a means to locate, identify, assess, and describe data - easily applied to a security audit. In hindsight, common sense, but at first the road map is very helpful.

Audit plan: assigned self as auditor; scope of interest was sensitive data. Reviewed internal documentation and scheduled interviews with the administrators of those departments. Created training and made recommendations for policy updates. Interviews were the more effective means of information gathering given institutional culture: hourlong sessions started with an overview of the project's goals, then asked interviewees to share thoughts and begin the conversation where they were most comfortable. Loosely structured interviews expanded the conversation - for instance, into photo and video records. Practices for storing, sharing, and preserving data varied across departments; departments with student employees maintain inconsistent lists, for example.

Inventory fields: source, department, OIT category, elements, sensitivity, storage, retention, use, access, notes (not exhaustive; based on scope). Challenges: definition and scope of a data source (a source could be the ILS or an individual spreadsheet; difficult to reduce to individual data elements; external sources; vendor-hosted data). Describing data sources and elements was not as clear as hoped, and the amount and variety of data were overwhelming - the initial scope and framework are very important. Sensitivity levels were assigned based on the internal framework. The majority of data was judged to be yellow, or mildly sensitive, and instances of red data the institution did not need were identified and removed. Library data - the most sacred user-related data - is not addressed in the Framework at all. Interesting conversations about what counts as library data, and which of that data should be protected.
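An inventory built on the fields above might look like the following sketch - the field names come from the talk, but the types, sample values, and helper function are all assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class InventoryEntry:
    """One row of a hypothetical data inventory (field names per the talk)."""
    source: str          # e.g. the ILS, or an individual spreadsheet
    department: str
    oit_category: str    # campus OIT classification
    elements: list       # data elements the source contains
    sensitivity: str     # "purple" / "red" / "yellow" / "green"
    storage: str
    retention: str
    use: str
    access: str
    notes: str = ""

def red_entries(inventory):
    """Surface red-level data so it can be reviewed and, if unneeded, removed."""
    return [e for e in inventory if e.sensitivity == "red"]

# Entirely made-up sample rows:
inv = [
    InventoryEntry("ILS", "Access Services", "University Data",
                   ["patron name", "barcode"], "yellow",
                   "vendor-hosted", "indefinite", "circulation", "staff"),
    InventoryEntry("spreadsheet", "Administration", "Highly Restricted",
                   ["SSN"], "red", "shared drive", "unknown",
                   "payroll", "admin staff"),
]
print([e.source for e in red_entries(inv)])  # ['spreadsheet']
```

Keeping the inventory as structured records rather than free-text notes makes the follow-up steps - finding red data to purge, or grouping entries by department for training - a one-line query.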

Next: training for staff on how to handle sensitive data, using relatable examples and the Framework. Policy updates reflecting the commitment to privacy and security are in the process of being recommended, and models for ongoing maintenance of the inventory and for data governance are being developed.

Individuals were insecure about their security practices. It is important to be compliant with campus guidelines; they contacted the head of compliance to make them aware of the project. The library is now better positioned to make decisions about data collection in the future.

Questions: Q: Did you classify working documents moving between staff? A: Yes - users included staff as well as transactions outside the library, and most of those are classified by the internal framework. Questioner's note: we talk about user privacy but apply different rules to our own work. Q: Any pushback on removing information from collection? A: It was more about knowing what data we're collecting so we can store it properly, and cutting sensitive data we don't use for anything.