Archive for the 'OAI-PMH' Category

ScientificCommons.org: Access to Over 13 Million Digital Documents

Posted in E-Prints, OAI-PMH, Open Access, Scholarly Communication, Search Engines on January 19th, 2007

ScientificCommons.org is an initiative of the Institute for Media and Communications Management at the University of St. Gallen. It indexes both metadata and full text from digital repositories worldwide, using OAI-PMH to identify relevant documents. The full-text documents are in PDF, PowerPoint, RTF, Microsoft Word, and PostScript formats. After being retrieved from their original repository, the documents are cached locally at ScientificCommons.org. It has indexed about 13 million documents from over 800 repositories.

Here are some additional features from the About ScientificCommons.org page:

Identification of authors across institutions and archives: ScientificCommons.org identifies authors and assigns them their scientific publications across various archives. Additionally the social relations between the authors will be extracted and displayed. . . .

Semantic combination of scientific information: ScientificCommons.org structures and combines the scientific data to knowledge areas with Ontology’s. Lexical and statistical methods are used to identify, extract and analyze keywords. Based on this processes ScientificCommons.org classifies the scientific data and uses it e.g. for navigational and weighting purposes.

Personalization services: ScientificCommons.org offers the researchers the possibilities to inform themselves about new publications via our RSS Feed service. They can customize the RSS Feed to a special discipline or even to personalized list of keywords. Furthermore ScientificCommons.org will provide an upload service. Every researcher can upload his publication directly to ScientificCommons.org and assign already existing publications at ScientificCommons.org to his own researcher profile.

Comments closed here. Read and add comments at
http://www.digital-scholarship.org/digitalkoans/.

DLF/NSDL OAI Best Practices Wiki

Posted in Metadata, OAI-PMH, Open Access on January 17th, 2007

The Digital Library Federation and NSDL OAI and Shareable Metadata Best Practices Working Group’s OAI Best Practices Wiki has a number of resources relevant to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and related metadata issues.

The Tools and Strategies for Using and Enhancing/Extending the OAI Protocol section is of particular interest. It includes information about OAI-PMH data provider and service provider registries, software solutions and packages, and static repositories and gateways; metadata management and added value tools as well as OAI and character validation tools; and using SRU/W, collection description schema, and NSDL safe transforms.

Is OAI-PMH Too Labor-Intensive?

Posted in Metadata, OAI-PMH, Open Access on January 9th, 2007

OAI-PMH permits metadata harvesting from disciplinary archives, institutional repositories, and other digital archives. This allows the creation of specialized search services using this harvested metadata. OAI-PMH is a key technology for the open access movement, but does it require too much human intervention?

An interesting message on JISC-REPOSITORIES by Santy Chumbe, Technical Officer of the PerX project, suggests that it may. He says:

We have learned that in despite of its relative simplicity, an OAI-PMH service can be harder to implement and maintain than expected. We have spent a lot of effort harvesting, normalising and maintaining metadata obtained from OAI data providers. In particular the issue of metadata quality is an important factor here. A summary of our experiences dealing with OAI-PMH can be found at http://eprints.rclis.org/archive/00006394. . . . A final report outlining the maintenance issues involved in the project is in progress but the experience gained suggests that successful ongoing maintenance of OAI targets would require a mixture of automated and manual approaches and that the level of ongoing maintenance is high.
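
To give a rough sense of the harvesting and normalising work described above, here is a minimal sketch of one harvesting step: building a ListRecords request and tidying the titles in the response. It is an illustration only, not the PerX project's code. The base URL, the sample response, and the function names are assumptions made up for this example; the verb, argument names, and namespaces are those of OAI-PMH 2.0 and unqualified Dublin Core.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: only the verb goes with it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier",
                                  default="", namespaces=NS)
        # Normalising step: collapse stray whitespace and line breaks in titles.
        titles = [" ".join(t.text.split())
                  for t in rec.iterfind(".//dc:title", NS) if t.text]
        records.append({"identifier": identifier.strip(), "titles": titles})
    token = root.findtext(".//oai:resumptionToken",
                          default="", namespaces=NS).strip()
    return records, token or None


# A made-up response fragment, standing in for what a data provider might send.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>  An   untidy
            title </dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

Even this toy version hints at where the maintenance effort goes: every data provider can differ in how it fills in titles, dates, and subjects, so the normalising step tends to grow one special case at a time.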

STARGATE Final Report and Tools

Posted in E-Journals, OAI-PMH, Open Access, Publishing, Scholarly Communication on December 7th, 2006

The STARGATE project has issued its final report. Here’s a brief summary of the project from the Executive Summary:

STARGATE (Static Repository Gateway and Toolkit) was funded by the Joint Information Systems Committee (JISC) and is intended to demonstrate the ease of use of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) Static Repository technology, and the potential benefits offered to publishers in making their metadata available in this way. This technology offers a simpler method of participating in many information discovery services than creating fully-fledged OAI-compliant repositories. It does this by allowing the infrastructure and technical support required to participate in OAI-based services to be shifted from the data provider (the journal) to a third party and allows a single third party gateway provider to provide intermediation for many data providers (journals).

To support its work, the project developed tools and supporting documentation, which can be found below:

OAI’s Object Reuse and Exchange Initiative

Posted in Disciplinary Archives, Emerging Technologies, Institutional Repositories, OAI-ORE, OAI-PMH, Open Access, Scholarly Communication on October 14th, 2006

The Open Archives Initiative has announced its Object Reuse and Exchange (ORE) initiative:

Object Reuse and Exchange (ORE) will develop specifications that allow distributed repositories to exchange information about their constituent digital objects. These specifications will include approaches for representing digital objects and repository services that facilitate access and ingest of these representations. The specifications will enable a new generation of cross-repository services that leverage the intrinsic value of digital objects beyond the borders of hosting repositories. . . . its real importance lies in the potential for these distributed repositories and their contained objects to act as the foundation of a new digitally-based scholarly communication framework. Such a framework would permit fluid reuse, refactoring, and aggregation of scholarly digital objects and their constituent parts—including text, images, data, and software. This framework would include new forms of citation, allow the creation of virtual collections of objects regardless of their location, and facilitate new workflows that add value to scholarly objects by distributed registration, certification, peer review, and preservation services. Although scholarly communication is the motivating application, we imagine that the specifications developed by ORE may extend to other domains.

OAI-ORE is being funded by the Andrew W. Mellon Foundation for a two-year period.

Presentations from the Augmenting Interoperability across Scholarly Repositories meeting are a good source of further information about the thinking behind the initiative, as is the "Pathways: Augmenting Interoperability across Scholarly Repositories" preprint.

Will You Only Harvest Some?

Posted in Disciplinary Archives, E-Prints, Institutional Repositories, OAI-PMH, Open Access on May 26th, 2005

The Digital Library for Information Science and Technology has announced DL-Harvest, an OAI-PMH service provider that harvests and makes searchable metadata about information science materials from the following archives and repositories:

  • ALIA e-prints
  • arXiv
  • Caltech Library System Papers and Publications
  • DLIST
  • Documentation Research and Training Centre
  • DSpace at UNC SILS
  • E-LIS
  • Metadata of LIS Journals
  • OCLC Research Publications
  • OpenMED@NIC
  • WWW Conferences Archive

DL-Harvest is a much needed, innovative discipline-based search service. Big kudos to all involved.

DLIST also just announced the formation of an advisory board.

The following musings, inspired by the DL-Harvest announcement, are not intended to detract from the fine work that DLIST is doing or from the very welcome addition of DL-Harvest to their service offerings.

Discipline-focused metadata can be relatively easily harvested from OAI-PMH-compliant systems that are organized along disciplinary lines (e.g., the entire archive/repository is discipline-based or an organized subset is discipline-based). No doubt these are very rich, primary veins of discipline-specific information, but how about the smaller veins and nuggets that are hard to identify and harvest because they are in systems or subsets that focus on another discipline?

Here’s an example. An economist, who is not part of a research center or other group that might have its own archive, writes extensively about the economics of the scholarly publishing business. This individual’s papers end up in the economics department section of his or her institutional repository and in EconWPA. They are highly relevant to librarians and information scientists, but will their metadata records be harvested for use in services like DL-Harvest using OAI-PMH since they are in the wrong conceptual bins (e.g., set in the case of the IR)?

Coleman et al. point to one solution in their intriguing "Integration of Non-OAI Resources for Federated Searching in DLIST, an Eprints Repository" paper. But (lots of hand waving here), if automatic metadata extraction were an easy and simple way to supplement conventional OAI-PMH harvesting, the bottom line question is: how good is good enough? In other words, what’s an acceptable level of accuracy for the automatic metadata extraction? (I won’t even bring up the dreaded "controlled vocabulary" notion.)

No doubt this problem falls under the 80/20 Rule, and the 20 is most likely in the low-hanging fruit OAI-PMH-wise, but wouldn’t it be nice to have more fruit?
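
The economist example can be made concrete with a toy sketch. It compares selecting harvested records by OAI-PMH set with selecting them by keywords in the title and subject fields. Everything here is hypothetical: the Record class, the set names ("lis", "econ"), the identifiers, and the sample records are invented for illustration, and this is not how DL-Harvest works.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A pared-down harvested record (hypothetical structure)."""
    identifier: str
    set_specs: list
    subjects: list = field(default_factory=list)
    title: str = ""


def select_by_set(records, wanted_sets):
    """Keep records that belong to at least one of the wanted sets."""
    wanted = set(wanted_sets)
    return [r for r in records if wanted & set(r.set_specs)]


def select_by_keyword(records, keywords):
    """Keep records whose title or subjects mention any keyword."""
    needles = [k.lower() for k in keywords]
    hits = []
    for r in records:
        haystack = " ".join([r.title, *r.subjects]).lower()
        if any(n in haystack for n in needles):
            hits.append(r)
    return hits


# Invented records: one in the "right" bin, two in the economics bin.
records = [
    Record("oai:ir.example.edu:101", ["lis"],
           ["digital libraries"], "Metadata quality in repositories"),
    Record("oai:ir.example.edu:202", ["econ"],
           ["scholarly publishing", "journal pricing"],
           "The economics of journal bundling"),
    Record("oai:ir.example.edu:303", ["econ"],
           ["labor markets"], "Wage dynamics"),
]

by_set = [r.identifier for r in select_by_set(records, ["lis"])]
by_keyword = [r.identifier
              for r in select_by_keyword(records,
                                         ["scholarly publishing", "metadata"])]
```

Set-based selection returns only the first record. The keyword pass also picks up the economist's paper, but it requires harvesting the whole repository first and depends entirely on how well the subject fields were filled in, which brings us back to the "how good is good enough?" question.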

More on OhioLINK’s Digital Resource Commons

Posted in Institutional Repositories, OAI-PMH, Open Access on May 6th, 2005

David F. Kohl has self-archived a PowerPoint presentation about the DRC at E-LIS. It’s called "Cooperating Beyond the ‘Buying Club’: Digital Resource Commons (DRC): Making the Impossible Possible in Ohio."

To quote from the abstract:

Each institution can ‘brand’ itself in the system and may host a discrete and customized interface to all of its content. To the end user it will appear as an institutional resource as if it were hosted on your own servers. There will also be a collective OhioLINK level branding and ability for searches to retrieve across the institutional collections. . . . You will have complete control of your own content and how it is accessed. Multi-tiered security levels will allow your content to be shared only to the extent desired. . . .

Alternatively content can be restricted to an individual department, to an institution, or to the OhioLINK membership. Each institution can set its own policies governing the content in its repositories. Likewise custom workflows can be established to make the most of the personnel involved in each project and expedite the content creation and capture process. The service will include robust and flexible cataloging tools to aid in the creation of records that can be searched and browsed effectively by all types of users. Catalog records can be exported in international standard XML formats such as the Open Archives Initiative Protocol for Metadata Harvesting. Through OhioLINK’s unique collaboration with the Ohio Supercomputer Center your content is stored on enterprise class servers and storage networks. . . . A huge storage area network allows virtually unlimited storage space on our disks. . . . Programming or system administration skills and experience are not required. The system is flexible and adaptable and provides services superior to ‘DSpace’ and ‘ContentDM’ without the associated costs.

OhioLINK’s Digital Resource Commons

Posted in Institutional Repositories, OAI-PMH, Open Access on May 6th, 2005

Peter Murray, Assistant Director of Multimedia Systems at OhioLINK, recently posted a job announcement on LITA-L (I’d link, but given the way ALA safeguards access to its lists, it’s simply impossible) that brought to my attention a bold OhioLINK project called the Digital Resource Commons, which is part of an even bolder project called the Ohio Digital Commons for Education. The quote from the job ad below describes the Digital Resource Commons. An earlier part of the ad indicates that Fedora will be used as the DRC’s platform.

OhioLINK’s Digital Resource Commons (DRC) is an Ohio Board of Regents-funded project to create a federated repository service that ingests, preserves, presents, and mediates administration of the educational and research materials of participating institutions. With the capability to store and deliver a virtually unlimited variety of digital file types and formats (including text, data sets, image, audio, video, streaming video, multimedia presentations, animations, etc.) the DRC is positioned to capture digital content from student and faculty researchers as it is produced and return it to users of the DRC upon request. The DRC offers wide and flexible control to member institutions and the communities within institution to define how content is added, preserved, and displayed to repository users. With federated community administration features, lead contacts at member institutions can create communities and delegate up to a complete subset of their privileges within the system to the editors/moderators of those new communities. The ability to scope and brand content to a particular community and institution is offered while retaining the ability to search for content across the entire repository. As both an Open Archives Initiative Data Provider and Service Provider, the DRC is positioned to become the premier point for the discovery of knowledge by and about Ohio’s scholars. In conjunction with the other parts of the Ohio Board of Regents grant funding, the DRC is one piece of a larger effort to build the Ohio Digital Commons for Education—a powerful vision for the future of learning and research in the state of Ohio.

The quote below from the DRC Web site describes the Ohio Digital Commons for Education.

The Digital Resource Commons is one of three projects funded by an Ohio Board of Regents Technology Initiatives grant collectively called the Ohio Digital Commons for Education (ODCE). The three components—this resource repository, the state-wide licensing and development of course management systems (WebCT and Blackboard), and a common access control mechanism (Shibboleth)—combine to offer a powerful vision for learning and research for the state of Ohio.

Impressive. As Daniel Hudson Burnham said: "Make no little plans; they have no magic to stir men’s blood and probably themselves will not be realized."

New OAI-PMH Guidelines

Posted in Copyright, Metadata, OAI-PMH on May 5th, 2005

The Open Archives Initiative has issued Conveying Rights Expressions about Metadata in the OAI-PMH Framework, a new Implementation Guidelines document aimed at clarifying the important issue of how to express rights information about harvested metadata in OAI-PMH.

From the document:

Data providers might want to associate rights expressions with the metadata to indicate how it may be used, shared, and modified after it has been harvested. This specification defines how rights information pertaining to the metadata should be included in responses to OAI-PMH requests. The described technique:

  • Is based on delivering rights expressions that apply to metadata included in OAI-PMH responses. It uses the optional containers that have been defined as part of the OAI-PMH specification. As a result, no changes to the protocol are made, and compatibility with all existing OAI-PMH implementations is maintained.
  • Is not tied to any particular rights expression language. This document makes use of Creative Commons and GNU licenses, but the use of these specific languages is for illustrative purposes only.

Essential reading for OAI-PMH geeks.
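
For the curious, here is a minimal sketch of how a harvester might read a rights expression from the optional about container of a record. The rights namespace and the rightsReference element with its ref attribute are written from my recollection of the guidelines and should be treated as assumptions to check against the document itself. The sample record, its identifier, the license URL, and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The "rights" namespace below is an assumption; verify it against the guidelines.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "rights": "http://www.openarchives.org/OAI/2.0/rights/",
}


def rights_references(record_xml):
    """Return the URLs of rights expressions referenced in a record's
    about containers (hypothetical helper)."""
    record = ET.fromstring(record_xml)
    refs = []
    path = "oai:about/rights:rights/rights:rightsReference"
    for ref in record.iterfind(path, NS):
        url = ref.get("ref")
        if url:
            refs.append(url)
    return refs


# A made-up record; the metadata part is omitted to keep the sample short.
SAMPLE_RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header><identifier>oai:example.org:42</identifier></header>
  <about>
    <rights xmlns="http://www.openarchives.org/OAI/2.0/rights/">
      <rightsReference ref="http://creativecommons.org/licenses/by/2.0/"/>
    </rights>
  </about>
</record>"""

found = rights_references(SAMPLE_RECORD)
```

Because the rights expression travels in an about container, a harvester that ignores it still parses the record normally, which is the compatibility point the guidelines make.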
