Sample script to harvest metadata through the CONTENTdm v6 API and format it as CSV for upload into the ui-libraries fork of Omeka/Scripto. See ui-libraries/plugin-Scripto for documentation. Uses pycdm, a Python library for working with the CONTENTdm v6 API (saverkamp/pycdm).
Created
February 7, 2013 17:53
Gist: saverkamp/4732757
# Python 2 script: relies on raw_input, the print statement and HTMLParser().unescape
import codecs
import csv
import datetime
import pycdm
from HTMLParser import HTMLParser

#get input: alias + items to retrieve
alias = raw_input('collection alias: ')
items = raw_input('item identifiers (separate by commas): ')
#strip whitespace around each identifier and skip empty entries
ptrs = [p.strip() for p in items.split(',') if p.strip()]

#current date-time for output filenames
today = datetime.datetime.now().strftime('%Y-%m-%d--%H-%M')

#create file-level metadata csv file
fileOutput = alias + today + '_File.csv'
fFile = codecs.open(fileOutput, 'wb', encoding='utf_8')
wtrFile = csv.writer(fFile, delimiter=',')
#header row for file-level csv file
fileHeaderRow = ['filename', 'title', 'identifier', 'source', 'status', 'transcription', 'Omeka file order']
wtrFile.writerow(fileHeaderRow)

#create item-level metadata csv file
itemOutput = alias + today + '_Item.csv'
fItem = codecs.open(itemOutput, 'wb', encoding='utf_8')
wtrItem = csv.writer(fItem, delimiter=',')
#header row for item-level csv file
itemHeaderRow = ['title', 'identifier', 'source', 'ispartof', 'relation', 'audience', 'files']
wtrItem.writerow(itemHeaderRow)

#get data for each item
for ptr in ptrs:
    #call api for item metadata
    item = pycdm.item(alias, ptr, 'on')

    #set item-level metadata
    #create unique item id for use outside CDM
    itemID = alias + '_' + ptr
    source = item.refurl
    itemtitle = item.info['title']
    #digital collection url
    ispartof = item.collection.url
    #default sorting number, maps to dc:Audience in Omeka
    sort = '000000'
    #collection guide url
    if ('findin' in item.info):
        relation = item.info['findin']
    elif ('collea' in item.info):
        relation = item.info['collea']
    else:
        #neither field present: leave blank rather than reuse the previous item's value
        relation = ''
    #list for file locations
    files = []
    #set counter for file order
    order = 1

    #set file-level metadata
    for page in item.pages:
        #create unique page id for use outside CDM
        fileID = itemID + '_' + page.id
        pagelabel = page.label
        pageRefURL = page.refurl
        #set transcription, if available
        #assumes you have a field for full text with nickname 'full' or 'fula'
        #non-ASCII characters are dropped before HTML entities are unescaped
        if (('full' in page.info) and page.info['full']):
            transcription = str(page.info['full'].encode('ascii', 'ignore'))
            transcription = HTMLParser().unescape(transcription)
        elif (('fula' in page.info) and page.info['fula']):
            transcription = str(page.info['fula'].encode('ascii', 'ignore'))
            transcription = HTMLParser().unescape(transcription)
        else:
            transcription = ''
        #set transcription status
        if (transcription == ''):
            status = 'Not Started'
        else:
            status = 'Needs Review'
        url = page.fileurl
        files.append(url)
        #write file metadata to file-level csv file
        filerow = [url, pagelabel, fileID, pageRefURL, status, transcription.encode('ascii', 'ignore'), order]
        wtrFile.writerow(filerow)
        order += 1

    #write item metadata to item-level csv file
    files = ','.join(files)
    itemrow = [itemtitle, itemID, source, ispartof, relation, sort, files]
    wtrItem.writerow(itemrow)
    print ptr

fItem.close()
fFile.close()
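The script above targets Python 2 (`raw_input`, the `print` statement, `HTMLParser().unescape`). As a rough sketch only, the transcription and file-row steps might look like this in Python 3, where `html.unescape` replaces the removed `HTMLParser().unescape`. The helper names (`clean_transcription`, `transcription_status`, `file_rows`) and the sample page dictionaries are hypothetical stand-ins for pycdm page objects, not part of pycdm. Unlike the original, this sketch keeps non-ASCII characters instead of dropping them.

```python
import csv
import html
import io


def clean_transcription(info):
    """Return unescaped text from the 'full' or 'fula' field, or '' if neither is set."""
    for nickname in ('full', 'fula'):
        text = info.get(nickname)
        if text:
            return html.unescape(text)
    return ''


def transcription_status(transcription):
    """Empty transcriptions are 'Not Started'; anything else is 'Needs Review'."""
    return 'Needs Review' if transcription else 'Not Started'


def file_rows(item_id, pages):
    """Build file-level rows in the same column order as fileHeaderRow."""
    rows = []
    for order, page in enumerate(pages, start=1):
        transcription = clean_transcription(page['info'])
        rows.append([
            page['fileurl'],
            page['label'],
            item_id + '_' + page['id'],
            page['refurl'],
            transcription_status(transcription),
            transcription,
            order,
        ])
    return rows


# Hypothetical page data standing in for pycdm page objects.
sample_pages = [
    {'id': '12', 'label': 'Page 1', 'refurl': 'http://example.org/ref/12',
     'fileurl': 'http://example.org/file/12', 'info': {'full': 'Tom &amp; Jerry'}},
    {'id': '13', 'label': 'Page 2', 'refurl': 'http://example.org/ref/13',
     'fileurl': 'http://example.org/file/13', 'info': {'fula': ''}},
]

rows = file_rows('demo_11', sample_pages)

# Write to an in-memory buffer; a real run would open a file with newline=''.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['filename', 'title', 'identifier', 'source', 'status',
                 'transcription', 'Omeka file order'])
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Keeping the row-building logic in plain functions that take dictionaries makes it possible to check the CSV layout without a live CONTENTdm server. Adapting it to real pycdm objects would mean reading `page.info`, `page.label`, and the other attributes used in the script instead of dictionary keys.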