Open Discussion about Data Transfer and Manipulation Challenges, Successes and Ideas
See bottom for instructions about adding subjects or comments.
General comments?
Caryn Anderson - 3nov07
Feel free to post thoughts about how to improve data transfer and manipulation, to discuss your challenges, or to respond to something someone else has said here.
Some data transfer and manipulation issues
Caryn Anderson - 3nov07
From Caryn Anderson's 2006 essay Electronic Resource Usage Statistics: Defining a Complex Problem (DOC)
When collecting data locally, institutions can customize their collection and delivery systems to produce the precise formats and data structure necessary for their needs. This section therefore focuses on vendor-provided statistics only. Of the vendors that do provide usage data, some offer the option to receive/review the data in HTML, TXT, XLS, CSV, or even XML (currently rare). This diversity is not bad in itself. The problem is that vendors do not all provide the same set of options, and some provide data in only one format.
Once data is retrieved, the next problem concerns the local data repository. Virtually all local repositories must be custom-designed to meet local uses of the data (ranging from simple spreadsheets to complex, custom databases which also include other institutional data like budget information, faculty citation statistics, departmental groups, etc.). There is great difficulty in designing these repositories, however, because of problems related to the necessary manipulation of usage data (addressed below) and fundamental data incomparability (treated in the next section on Data Incomparability).
In order to deposit all e-resource usage data into a single location (be it a spreadsheet or database), vendor-provided data must be manipulated to varying degrees. Different delivery formats (HTML, CSV, etc.) must be normalized through unique processing protocols for each format. The data must be cleaned to eliminate extraneous presentation elements (like report titles, footnotes or blank lines). The raw data itself may need to be further aggregated, disaggregated, or transposed to normalize content increments that may be inconsistent with those of the repository (e.g. data provided in daily or weekly increments may need to be aggregated to monthly totals prior to deposit). In many cases the raw data is not sufficiently meaningful and local categories must be assigned to columns/groups of data to ensure that the data gets into the right place in the repository.
It is clear that some sort of XML-based data transfer protocol standard would facilitate faster and more meaningful data transfer and integration, which would in turn enable easier local repository design. There is currently no such standard, although NISO, ERMI, COUNTER and commercial vendors have begun trials to develop and test such a protocol (SUSHI – Standardized Usage Statistics Harvesting Initiative). The COUNTER standard incorporates some delivery standards, but these standards are not entirely sufficient. In addition, the percentage of vendors that can be considered substantially COUNTER-compliant (i.e. compliant with all 5 report types) is still relatively low. This is partly due to compliance timelines (statistics calculation methods must be re-tooled by varying degrees), partly due to a lack of intention by some vendors, and partly due to the fact that journals and bibliographic citation databases, currently the only resource types addressed by the COUNTER Code of Practice, represent only two of a much larger stable of e-resources (e.g. e-books, e-reference, legal, statistics, etc.).
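As a rough illustration of the cleaning and aggregation steps described in the excerpt above, here is a minimal Python sketch. It assumes a hypothetical vendor report with three columns (resource name, ISO date, daily count) wrapped in a report title, a header row and a footnote. The column layout, the sample data and the function names `clean_rows` and `monthly_totals` are made up for illustration and do not come from any real vendor or from the essay.

```python
import csv
import io
from collections import defaultdict
from datetime import date


def clean_rows(raw_text, delimiter=","):
    """Yield (resource, day, count) tuples from a delimited vendor report.

    Report titles, header rows, footnotes and blank lines are skipped:
    anything that is not exactly three cells, or whose date or count
    cannot be parsed, is treated as presentation residue.
    """
    reader = csv.reader(io.StringIO(raw_text), delimiter=delimiter)
    for row in reader:
        cells = [cell.strip() for cell in row]
        if len(cells) != 3:
            continue  # title line, footnote or blank line
        resource, day_text, count_text = cells
        try:
            day = date.fromisoformat(day_text)
            count = int(count_text)
        except ValueError:
            continue  # header row or other non-data line
        yield resource, day, count


def monthly_totals(rows):
    """Aggregate daily counts into totals keyed by (resource, 'YYYY-MM')."""
    totals = defaultdict(int)
    for resource, day, count in rows:
        totals[(resource, day.strftime("%Y-%m"))] += count
    return dict(totals)


# Hypothetical sample report, invented for this sketch.
SAMPLE = """Usage Report for Example Library

Resource,Date,Sessions
Journal A,2007-10-30,4
Journal A,2007-10-31,6
Journal A,2007-11-01,5
Journal B,2007-11-01,2

* Footnote: sessions time out after 30 minutes
"""

if __name__ == "__main__":
    for key, total in sorted(monthly_totals(clean_rows(SAMPLE)).items()):
        print(key, total)
```

A tab-delimited TXT export could go through the same functions by passing `delimiter="\t"`, but HTML or XLS reports would each need their own reader in front of `monthly_totals`, which is the per-format processing burden the excerpt describes.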
Add a topic/subject (by using a double exclamation point)
and/or add a horizontal line (by entering three dashes), put your name/date below and your comments/contributions below that.
PBwiki Help
- To edit the page you are on, just click "Edit Page" at the top. Add text wherever you like and click "Preview" to see how it looks. DON'T FORGET to click "Save" when you are happy with the Preview or all your edits will be lost.