Access 2017 Conference Day 2 Notes Sessions 4-7 #accessYXE
Session 4: IIIF You Can Dream and Not Make Dreams Your Master - IIIF In The Real World - Peter Binkley
IIIF (Triple-I Eff) - provides above the tiled image level, a whole presentation level - a book reader like Internet Archive, but manifest itself is a lovely structured package of info about a digitzed object that can be used for different purposes. Useful beyond the eyecandy of slides.
See http://iiif.io - learn 90% of what you want to know here
About 100 participating institutions in IIIF community like Internet Archive, Getty. Also a github/iiif repository and worth looking in there called awesome-iiif which is the community-maintained list of resources and demos and links, etc.
Goals of IIIF - provide unprecedented level of uniform and rich access to image-based resources for scholars; define APIs that support interoperability between repositories; provides a world-class user experience in viewing, comparing, manipulating, and annotating images. Annotating - very exciting! Use open annotation specification to allow users to create annotations that you let them view.
A family of APIs: Image API does the tiling - google maps effect of infinitely deep zoom into image via detailed tiles cut from image so not downloading huge image all at once. Presentation API - most interesting and impactful - metadata package with structural metadata about items you're presenting, can link to descriptive metadata, can only embed as name value parents as labels - you shouldnt be doing structured metadata here. Search & Authentication APIs are most recent additions. Search intended to allow you to drive a full text search of document based on a server-side index. IIIF client can run queries against your items. Authentication API for those required to authenticate access to images and tiles, example from Canadiana. All of these APIs based on json-ld. Most promising - every time presenting book through IIIF you are creating linked data. URIs available for use beyond IIIF system.
Tiled image layer - depends on a specific URL pattern - build URL with 5 elements appended to identifier of image you want (region [cropped tile area], size[1024], rotation [0], quality [default], format [jpg]) where [] are examples). As it creates the tile you've asked for, applies operations in order listed. Not often making these by hand, but by javascript client running in browser. Example of Princeton and their illuminated manuscript shown in demo.
Need to put processing power behind it to meet requests. All of the info a IIIF client needs to determine what it should ask for is encapsulated in a file called info.json - have one for every image in collection, sometimes created on the fly. It tells the application what the options are. Client knows what server can do, client needs to make requests server can actually serve based on the menu of options. This is linked data. Json-ld: Linked data expressed in JSON - subject, predicate, object. JSON reasonably concise but is linked data, full RDF, this is a good thing.
3 levels of compliance - 2 has more possibilities and features it supports. Level 0 specifies a level of functionality that can be satisfied by static files all pregenerated, and client will only ask for those limited images to present to user - low level of technical commitment to get on the IIIF bandwagon. He has written Jekyll plug-in to let you throw hi-res images in and get a level 0 IIIF presentation for static log and it pretty much works.
Presentation from Book of Hours of Geneve, Bibliotheque de Geneve - zoom, side by side page view, click on info and see embedded metadata of simple name value pairs. If want to compare, the Mirador (sp?) client allows to open two side-by-side.
All driven by shared Canvas model - specification allows to describe each image as a canvas onto which we paint other things. There might be text annotations we want to overlay or highlight, count as annotations going onto a canvas [Personal note: Gah! This might work for Island stuff]. Miniatures cut from medieval manuscripts and at other libraries can be overlaid digitally and restore manuscript by painting into coordinates. Once you go IIIF a third party can pull tiles and publish.
Responsibility to maintain URIs must be useful as basis of open annotations others will make, which may be on other servers; if reorganize and break URIs, is disservice to scholars, so we must treat it as linked data.
Session 5: Cantaloupes and Canvases: Adopting IIIF Specifications - Sascha Adler
Why is IIIF at Canadiana (pipeline from digitization to preservation to access, repository, provide access, some collections free, some subscription based, specialize in early Canadian history). Why adopt it?
Content Server (COS) serves derivative images and direct file access (full pdfs). Everything else goes through CAP which does everything - presentation, search, user and institution authentication, admin tools, CMS, administrative tools, etc. Tough to maintain and update. Move from monolith to modules - smaller interconnected services to improve codebase. IIIF helps because multiple services will be publicly accessible, would be nice to have open internationally adopted standards. Replace or modules with community built software so maintain less code. Also want to provide content with public APIs, right now can only access through web interface. Also helps think about new features - annotations are very powerful, they would love transcriptions for written collections where OCR doesn't work, search term highlighting on top of images, and the IIIF annotation spec allows us to think about this. Right now they don't do tiled image rendering, but IIIF standards adopted and put in a viewer and they'd be good to go.
6 month progress so far:
Cantaloupe - derivative image generation medusa-project.github.io/cantaloupe/
Written in java, handles jpegs, tiffs, pdfs, jpeg2000, easy to configure but watch out for properties file and
see config and Docker image at github.com/c7a/cihm-cantaloupe
Direct file access spun out to separate service, CAP image viewer rewritten to make requests with IIIF
converting series to collections, and items to manifests for IIIF - output can be plugged into IIIF compatible viewers.
Authentication: cantaloupe is configured to only handle authorized requests, easy to configure. Full integration of IIIF authentication API will take some more work. Given a user and an image, what rights does request have (doable but will take time).
Next - Authentication API, convert internal presentation output to IIIF, Evaluate IIIF viewers, internal data models for annotations (explore and develop models)
see github.com/c7a for their work
@SaschaAdler
Session 6: An Open Source Approach to Proactively Manage EZProxy - Jason Zou
Cancelled but Slides available here - Session author unavailable
Abstract: It is a daunting task to manage EZProxy without any additional tools. Especially, when various vendors notify you that they have to cut off your institutions' access, because of massive or illegal downloading. Where to start? The presenter will share his experiences of using an open source approach to proactively monitor and manage EZproxy.
Session 7: Kenny: A Web-Based Tool for EZProxy Authentication Management
Web based login management tool. Kenny? Design decisions are an important part of any project: what server platform, what app framework, what license model. but NAME. Manage Log ins... loggins...Kenny. Loggins.
Problem: Academic libraries are constant target for academic theft. User accounts compromised, probing from servers and bots. Occasional need to override a user's level of access (demoting from current student to alumni status, for ex). We run EZProxy to manage this. EZProxy and other servers are under constant attack by big organizations, not little users, to harvest data we license access to. They group database by access status, but occasional need to override user's access.
But EZProxy! EZP does privide some ways to reject IPs and override access groups folks are assigned to. RejectIP directive, a user.txt file (baduser:deny). Disadvantages - to make changes, those in charge need SSH access to the EXP server and permission to write those files. Changes in Config.txt require EZP daemon to be restarted. Root privileges? Yeah right, not giving that out. Also no record of who blocked what user on what day and why.
If others are working with patrons to reinstate access, they would need to kick support ticket to IT to fix, extra delay and a disservice. Kenny as solution. Web based tool to allow any authorized member manage blocking, or overriding the assigned EZproxy groups, for patron accounts and IP ranges. Doesnt run on EZP server, doesn't require staff have access to server with SSH or root access. Uses standard authentication method supported by EZP, so no reboots of EZP needed. Designed to capture complete history of who blocked, when why and by whom. If configured properly, Kenny can terminate sessions in the EZP server (blocking user only half problem; if an open connection, even blocking user doesn't impact as long as session is open)
EZP supports 2 custom authentication methods, CGI and External, Kenny makes use of external (after CAS or LDAP and acts as filter).
Default list is all currently blocked users. can import text file of blocked users into Kenny.
Great for those who deal with compromised accounts in EZProxy.
PHP application, developed using Laravel 5.4 framework. Bootstrap, with support for Bootswatch themes. Session management accomplished via HTTP requests to standard EXP web interface. Open source, coming soon to GitHub
IIIF (Triple-I Eff) - provides above the tiled image level, a whole presentation level - a book reader like Internet Archive, but manifest itself is a lovely structured package of info about a digitzed object that can be used for different purposes. Useful beyond the eyecandy of slides.
See http://iiif.io - learn 90% of what you want to know here
About 100 participating institutions in IIIF community like Internet Archive, Getty. Also a github/iiif repository and worth looking in there called awesome-iiif which is the community-maintained list of resources and demos and links, etc.
Goals of IIIF - provide unprecedented level of uniform and rich access to image-based resources for scholars; define APIs that support interoperability between repositories; provides a world-class user experience in viewing, comparing, manipulating, and annotating images. Annotating - very exciting! Use open annotation specification to allow users to create annotations that you let them view.
A family of APIs: Image API does the tiling - google maps effect of infinitely deep zoom into image via detailed tiles cut from image so not downloading huge image all at once. Presentation API - most interesting and impactful - metadata package with structural metadata about items you're presenting, can link to descriptive metadata, can only embed as name value parents as labels - you shouldnt be doing structured metadata here. Search & Authentication APIs are most recent additions. Search intended to allow you to drive a full text search of document based on a server-side index. IIIF client can run queries against your items. Authentication API for those required to authenticate access to images and tiles, example from Canadiana. All of these APIs based on json-ld. Most promising - every time presenting book through IIIF you are creating linked data. URIs available for use beyond IIIF system.
Tiled image layer - depends on a specific URL pattern - build URL with 5 elements appended to identifier of image you want (region [cropped tile area], size[1024], rotation [0], quality [default], format [jpg]) where [] are examples). As it creates the tile you've asked for, applies operations in order listed. Not often making these by hand, but by javascript client running in browser. Example of Princeton and their illuminated manuscript shown in demo.
Need to put processing power behind it to meet requests. All of the info a IIIF client needs to determine what it should ask for is encapsulated in a file called info.json - have one for every image in collection, sometimes created on the fly. It tells the application what the options are. Client knows what server can do, client needs to make requests server can actually serve based on the menu of options. This is linked data. Json-ld: Linked data expressed in JSON - subject, predicate, object. JSON reasonably concise but is linked data, full RDF, this is a good thing.
3 levels of compliance - 2 has more possibilities and features it supports. Level 0 specifies a level of functionality that can be satisfied by static files all pregenerated, and client will only ask for those limited images to present to user - low level of technical commitment to get on the IIIF bandwagon. He has written Jekyll plug-in to let you throw hi-res images in and get a level 0 IIIF presentation for static log and it pretty much works.
Presentation from Book of Hours of Geneve, Bibliotheque de Geneve - zoom, side by side page view, click on info and see embedded metadata of simple name value pairs. If want to compare, the Mirador (sp?) client allows to open two side-by-side.
All driven by shared Canvas model - specification allows to describe each image as a canvas onto which we paint other things. There might be text annotations we want to overlay or highlight, count as annotations going onto a canvas [Personal note: Gah! This might work for Island stuff]. Miniatures cut from medieval manuscripts and at other libraries can be overlaid digitally and restore manuscript by painting into coordinates. Once you go IIIF a third party can pull tiles and publish.
Responsibility to maintain URIs must be useful as basis of open annotations others will make, which may be on other servers; if reorganize and break URIs, is disservice to scholars, so we must treat it as linked data.
Session 5: Cantaloupes and Canvases: Adopting IIIF Specifications - Sascha Adler
Why is IIIF at Canadiana (pipeline from digitization to preservation to access, repository, provide access, some collections free, some subscription based, specialize in early Canadian history). Why adopt it?
Content Server (COS) serves derivative images and direct file access (full pdfs). Everything else goes through CAP which does everything - presentation, search, user and institution authentication, admin tools, CMS, administrative tools, etc. Tough to maintain and update. Move from monolith to modules - smaller interconnected services to improve codebase. IIIF helps because multiple services will be publicly accessible, would be nice to have open internationally adopted standards. Replace or modules with community built software so maintain less code. Also want to provide content with public APIs, right now can only access through web interface. Also helps think about new features - annotations are very powerful, they would love transcriptions for written collections where OCR doesn't work, search term highlighting on top of images, and the IIIF annotation spec allows us to think about this. Right now they don't do tiled image rendering, but IIIF standards adopted and put in a viewer and they'd be good to go.
6 month progress so far:
Cantaloupe - derivative image generation medusa-project.github.io/cantaloupe/
Written in java, handles jpegs, tiffs, pdfs, jpeg2000, easy to configure but watch out for properties file and
see config and Docker image at github.com/c7a/cihm-cantaloupe
Direct file access spun out to separate service, CAP image viewer rewritten to make requests with IIIF
converting series to collections, and items to manifests for IIIF - output can be plugged into IIIF compatible viewers.
Authentication: cantaloupe is configured to only handle authorized requests, easy to configure. Full integration of IIIF authentication API will take some more work. Given a user and an image, what rights does request have (doable but will take time).
Next - Authentication API, convert internal presentation output to IIIF, Evaluate IIIF viewers, internal data models for annotations (explore and develop models)
see github.com/c7a for their work
@SaschaAdler
Session 6: An Open Source Approach to Proactively Manage EZProxy - Jason Zou
Cancelled but Slides available here - Session author unavailable
Abstract: It is a daunting task to manage EZProxy without any additional tools. Especially, when various vendors notify you that they have to cut off your institutions' access, because of massive or illegal downloading. Where to start? The presenter will share his experiences of using an open source approach to proactively monitor and manage EZproxy.
Session 7: Kenny: A Web-Based Tool for EZProxy Authentication Management
Web based login management tool. Kenny? Design decisions are an important part of any project: what server platform, what app framework, what license model. but NAME. Manage Log ins... loggins...Kenny. Loggins.
Problem: Academic libraries are constant target for academic theft. User accounts compromised, probing from servers and bots. Occasional need to override a user's level of access (demoting from current student to alumni status, for ex). We run EZProxy to manage this. EZProxy and other servers are under constant attack by big organizations, not little users, to harvest data we license access to. They group database by access status, but occasional need to override user's access.
But EZProxy! EZP does privide some ways to reject IPs and override access groups folks are assigned to. RejectIP directive, a user.txt file (baduser:deny). Disadvantages - to make changes, those in charge need SSH access to the EXP server and permission to write those files. Changes in Config.txt require EZP daemon to be restarted. Root privileges? Yeah right, not giving that out. Also no record of who blocked what user on what day and why.
If others are working with patrons to reinstate access, they would need to kick support ticket to IT to fix, extra delay and a disservice. Kenny as solution. Web based tool to allow any authorized member manage blocking, or overriding the assigned EZproxy groups, for patron accounts and IP ranges. Doesnt run on EZP server, doesn't require staff have access to server with SSH or root access. Uses standard authentication method supported by EZP, so no reboots of EZP needed. Designed to capture complete history of who blocked, when why and by whom. If configured properly, Kenny can terminate sessions in the EZP server (blocking user only half problem; if an open connection, even blocking user doesn't impact as long as session is open)
EZP supports 2 custom authentication methods, CGI and External, Kenny makes use of external (after CAS or LDAP and acts as filter).
Default list is all currently blocked users. can import text file of blocked users into Kenny.
Great for those who deal with compromised accounts in EZProxy.
PHP application, developed using Laravel 5.4 framework. Bootstrap, with support for Bootswatch themes. Session management accomplished via HTTP requests to standard EXP web interface. Open source, coming soon to GitHub
Comments