Japanese Translation Example of the Archivematica AIP README File

The AIP created by Archivematica includes a README.html file. I translated this file using DeepL (with some manual corrections). There may be many errors, but I hope you find it helpful.

Archivematica AIP Structure

This Readme file describes the basic structure of the Archival Information Package (AIP) generated by Archivematica.

Acronyms

- AIP = Archival Information Package
- METS = Metadata Encoding and Transmission Standard
- OAIS = Open Archival Information System
- PDI = Preservation Description Information

...

February 9, 2023 · 5 min · Nakamura

Trying the Archivematica API (Archivematica API - Transfer)

Overview

This is the Archivematica API section of “Trying the Archivematica API.” (There is also a separate “Storage Service API” section.)

https://www.archivematica.org/en/docs/archivematica-1.13/dev-manual/api/api-reference-archivematica/#api-reference-archivematica

This time, I will try the following “Transfer” API.

https://www.archivematica.org/en/docs/archivematica-1.13/dev-manual/api/api-reference-archivematica/#transfer

Usage

You can try it with the following notebook.

https://colab.research.google.com/github/nakamura196/ndl_ocr/blob/main/ArchivematicaのAPIを使ってみる.ipynb

The following configuration was required. The location UUID was confirmed from the storage service.

    ## Server settings
    endpoint = "http://<domain>:81/api"
    username = "<username>"
    api_key = "<API key>"
    location_uuid = "<location UUID>"

    ## Transfer settings
    name = "mc_api_transfer"
    type = "standard"
    accession = "2023-1234"
    paths = ["files/movie_test"]
    row_ids = [""]

    ## Encode to base64
    import base64

    paths_encoded = []
    for path in paths:
        path_encoded = base64.b64encode(f"{location_uuid}:{path}".encode()).decode()
        paths_encoded.append(path_encoded)

    ## POST
    import requests

    data = {
        "name": name,
        "type": type,
        "accession": accession,
        "paths[]": paths_encoded,
        "row_ids[]": row_ids
    }
    headers = {'Authorization': f'ApiKey {username}:{api_key}'}
    response = requests.post(f'{endpoint}/transfer/start_transfer/', headers=headers, data=data)

Summary

This time I only tried Start Transfer, but APIs are provided for various operations, enabling a wide range of system integrations. ...
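The path encoding step above can be isolated into a small helper, which makes it easy to check what is actually sent. The following is a minimal sketch: the function names and the placeholder UUID are my own, for illustration only, and the decoder is included just to verify the round trip.

```python
import base64


def encode_transfer_path(location_uuid: str, path: str) -> str:
    """Return '<location_uuid>:<path>' encoded as a base64 ASCII string."""
    return base64.b64encode(f"{location_uuid}:{path}".encode()).decode()


def decode_transfer_path(encoded: str) -> tuple[str, str]:
    """Reverse encode_transfer_path (useful when debugging a request)."""
    location_uuid, path = base64.b64decode(encoded).decode().split(":", 1)
    return location_uuid, path


# Hypothetical placeholder UUID, not a real location.
example_uuid = "00000000-0000-0000-0000-000000000000"
encoded = encode_transfer_path(example_uuid, "files/movie_test")
print(encoded)
print(decode_transfer_path(encoded))
```

Splitting on the first colon only keeps any colons inside the path intact.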

February 9, 2023 · 1 min · Nakamura

Setting Up Archivematica on Amazon EC2

Overview Archivematica is open-source software for long-term preservation of digital data. https://www.archivematica.org/en/ I had the opportunity to set up Archivematica on Amazon EC2, so this is a memo. Installation The installation instructions are described on the following page. https://www.archivematica.org/en/docs/archivematica-1.13/admin-manual/installation-setup/installation/installation/ There are several options, but this time I tried “CentOS 7 64-bit, Installing Archivematica on CentOS/Red Hat.” https://www.archivematica.org/en/docs/archivematica-1.13/admin-manual/installation-setup/installation/install-centos/#install-pkg-centos EC2 Instance Since CentOS 7 was specified, I selected the following Amazon Machine Image (AMI). ...

February 8, 2023 · 4 min · Nakamura

Creating IIIF Manifest Files Using a Headless CMS

Overview As a learning exercise for Headless CMS, I attempted to generate IIIF manifests from information registered in a CMS. Here are the results. (That said, the server-side processing details are not visible from the app below.) https://iiif-headless-cms.vercel.app/ This article serves as a memorandum of the above effort. Contentful https://www.contentful.com/ I created a Content model called iiif as shown below. For associating image data (url, width, height), both the “JSON object” and “Reference” fields seemed usable, but I chose “Reference” here and created a separate Content model called image to manage image data information. ...
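To illustrate how image records (url, width, height) can be turned into a manifest, here is a minimal sketch. It assumes the IIIF Presentation API 3.0 structure; the article does not say which version the app uses. The function name, the `base_url` scheme for IDs, and the `image/jpeg` format are assumptions for illustration, not the app's actual code.

```python
def build_manifest(base_url, label, images):
    """Build a minimal IIIF Presentation 3.0 manifest as a dict.

    images: list of dicts with 'url', 'width' and 'height' keys,
    mirroring the fields of the image content model described above.
    """
    canvases = []
    for index, image in enumerate(images, start=1):
        canvas_id = f"{base_url}/canvas/{index}"  # hypothetical ID scheme
        canvases.append({
            "id": canvas_id,
            "type": "Canvas",
            "width": image["width"],
            "height": image["height"],
            "items": [{
                "id": f"{canvas_id}/page",
                "type": "AnnotationPage",
                "items": [{
                    "id": f"{canvas_id}/annotation",
                    "type": "Annotation",
                    "motivation": "painting",
                    "body": {
                        "id": image["url"],
                        "type": "Image",
                        "format": "image/jpeg",  # assumed format
                        "width": image["width"],
                        "height": image["height"],
                    },
                    "target": canvas_id,
                }],
            }],
        })
    return {
        "@context": "http://iiif.io/api/presentation/3/context.json",
        "id": f"{base_url}/manifest.json",
        "type": "Manifest",
        "label": {"none": [label]},
        "items": canvases,
    }


manifest = build_manifest(
    "https://example.org/iiif/1",  # placeholder base URL
    "Sample",
    [{"url": "https://example.org/a.jpg", "width": 1000, "height": 800}],
)
print(manifest["items"][0]["id"])
```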

February 3, 2023 · 2 min · Nakamura

Program for Batch Image Registration to Omeka S

Overview When batch importing metadata (items in Omeka terminology) and images (media in Omeka terminology) into Omeka S, the Bulk Import module is commonly used. https://github.com/Daniel-KM/Omeka-S-module-BulkImport However, it is also possible to register via the REST API provided by Omeka S. This article introduces a program I created for batch image registration using this API. Reason for Development The latest version of the Bulk Import module allows you to choose whether to stop or continue when an error occurs, but older versions of the module do not have this option. As a result, when batch registering images, there were cases where images were missing each time image retrieval failed. ...
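The "continue on error" behavior described above can be sketched as a loop that records failures instead of stopping. This is not the article's program: `post_media` is a hypothetical callable that would send one payload to the Omeka S media endpoint with the API keys and raise on failure. The payload keys (`o:ingester`, `ingest_url`, `o:item`) follow the Omeka S REST API as I understand it and should be checked against your installation.

```python
def register_images(image_urls, item_id, post_media):
    """Register each image URL as media on one item, continuing on errors.

    post_media: hypothetical callable that POSTs one payload and raises
    an exception if the request fails.
    Returns a list of (url, error message) pairs for the failed images.
    """
    failed = []
    for url in image_urls:
        payload = {
            "o:ingester": "url",          # ask Omeka S to fetch the image by URL
            "ingest_url": url,
            "o:item": {"o:id": item_id},  # the item the media belongs to
        }
        try:
            post_media(payload)
        except Exception as exc:  # keep going so one failure does not stop the batch
            failed.append((url, str(exc)))
    return failed


def fake_post(payload):
    """Stand-in for a real HTTP call, used only for this demonstration."""
    if payload["ingest_url"].endswith("broken.jpg"):
        raise RuntimeError("image retrieval failed")


print(register_images(
    ["https://example.org/ok.jpg", "https://example.org/broken.jpg"],
    item_id=1,
    post_media=fake_post,
))
```

Returning the failed URLs makes it possible to retry only those images afterwards, rather than discovering missing media later.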

February 3, 2023 · 2 min · Nakamura

Publishing Images Using IIIF Image API Level 0

Overview IIIF Image API level 0 delivers images using pre-generated static tile images. This enables image publishing using only static file hosting services such as GitHub Pages or Amazon S3. However, it has the drawback of not being able to extract arbitrary regions of images. This article introduces an example of publishing images using IIIF Image API level 0. Tool You can try it with the following notebook. https://colab.research.google.com/github/nakamura196/ndl_ocr/blob/main/IIIF_Image_API_静的ファイル作成ツール.ipynb This notebook is based on the following script. ...
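To make the idea of pre-generated tiles concrete, the sketch below lists the region and size parameters a level 0 tile set would need to cover for a given image. It is my own illustration, not the notebook's script. It writes the size as `w,`, which is the Image API 2 form; Image API 3 uses `w,h` as the canonical form. The function name and the sample dimensions are assumptions.

```python
import math


def level0_tiles(width, height, tile_size, scale_factors):
    """List (region, size) URL parameters for a static tile set.

    For each scale factor, the image is cut into regions of
    tile_size * scale pixels (clipped at the right and bottom edges),
    and each region is delivered scaled down by that factor.
    """
    tiles = []
    for scale in scale_factors:
        step = tile_size * scale
        for y in range(0, height, step):
            for x in range(0, width, step):
                region_w = min(step, width - x)
                region_h = min(step, height - y)
                out_w = math.ceil(region_w / scale)
                tiles.append((f"{x},{y},{region_w},{region_h}", f"{out_w},"))
    return tiles


for region, size in level0_tiles(1000, 800, 512, [1, 2]):
    # Each pair corresponds to one static file: {id}/{region}/{size}/0/default.jpg
    print(region, size)
```

Only these exact region and size combinations exist as files, which is why arbitrary regions cannot be requested at level 0.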

January 30, 2023 · 1 min · Nakamura

Created a Program to Calculate Edit Distance for TEI/XML Files Containing app Elements

Overview

I created a program to calculate edit distance for TEI/XML files containing app elements. You can use it from the following Google Colab notebook:

https://colab.research.google.com/github/nakamura196/ndl_ocr/blob/main/編集距離を算出するプログラム.ipynb

Upload an XML file and the program will calculate the similarity between witnesses.

Example

Let’s upload the following XML file:

https://tei-eaj.github.io/koui/data/nakamura.xml

The result is an Excel file like the following, which provides an overview of the similarity between witnesses.

    index   name1             name2                 distance   ratio
    0       中村式五十音      中村式五十音又様      10         0.85
    1       中村式五十音      中村式五十音欠損本    7          0.8947368421052632
    2       中村式五十音又様  中村式五十音欠損本    8          0.868421052631579

The following library is used for calculating similarity: ...
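For readers unfamiliar with edit distance, here is a minimal pure-Python sketch of the Levenshtein distance and one common way to normalize it into a similarity score. It is not the library the program uses, and its `similarity` value is not guaranteed to match the `ratio` column above, since libraries define that ratio in different ways.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize the distance by the longer length (an assumed definition)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))
print(similarity("kitten", "sitting"))
```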

January 26, 2023 · 1 min · Nakamura

[Omeka S Module Introduction] BulkExport: Bulk Data Export

Overview BulkExport is a module for performing bulk data export in Omeka S. https://github.com/Daniel-KM/Omeka-S-module-BulkExport This article explains how to use this module. Installation Like other common modules, it can be installed using the standard method. Download the latest zip file from the following URL. https://github.com/Daniel-KM/Omeka-S-module-BulkExport/releases The Log module must be installed before BulkExport, as shown below. Usage Click “BulkExport” on the left side of the admin screen, then click the export icon. ...

January 22, 2023 · 1 min · Nakamura

Collaborative Editing of TEI/XML Files Using Visual Studio Live Share (Not Limited to XML)

Overview Visual Studio Live Share is a VSCode extension that enables real-time collaborative development. https://visualstudio.microsoft.com/ja/services/live-share/ This time, we will try real-time collaborative editing of TEI/XML files using this extension. Demo Video A video of the collaborative editing was recorded. https://youtu.be/DzyuJAtzl90 The right side of the screen shows a user (nakamura196) using VSCode in a local environment, while the left side shows a user (Guest User) invited via Visual Studio Live Share editing using the online VSCode (vscode.dev). ...

January 19, 2023 · 3 min · Nakamura

Converting Word to TEI/XML

Overview I had an opportunity to convert Word files to TEI/XML files. Upon investigation, in addition to official TEI tools such as TEIGarage Conversion, I found a conversion example in TEI Publisher: https://teipublisher.com/exist/apps/tei-publisher/test/test.docx.xml The above example appeared to convert Word style information into TEI tags, so I tried this approach. For this project, I used the python-docx library with the goal of using it independently of TEI Publisher. Word File I created a prototype Word file like the one below. All styles are provisional, but I created styles such as “tei:persName” and “tei:warichu” and changed their visual styling such as color. The mechanism works by applying styles to perform simple structuring. ...
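The style-to-tag mechanism can be sketched without a Word file. The example below assumes the runs of one paragraph have already been read into (style name, text) pairs, for instance from `run.style.name` and `run.text` in python-docx. The function name, the `tei:` prefix convention, and the sample data are illustrative assumptions, not the article's actual code.

```python
import xml.etree.ElementTree as ET

TEI_PREFIX = "tei:"  # styles named "tei:<element>" are turned into <element>


def runs_to_tei(runs):
    """Convert (style_name, text) pairs of one paragraph into a TEI <p> string."""
    p = ET.Element("p")
    last = None  # most recently added child element, if any
    for style_name, text in runs:
        if style_name.startswith(TEI_PREFIX):
            last = ET.SubElement(p, style_name[len(TEI_PREFIX):])
            last.text = text
        elif last is None:
            # Plain text before the first tagged run belongs to <p> itself.
            p.text = (p.text or "") + text
        else:
            # Plain text after a tagged run is stored as that element's tail.
            last.tail = (last.tail or "") + text
    return ET.tostring(p, encoding="unicode")


sample_runs = [
    ("Default Paragraph Font", "Visited by "),
    ("tei:persName", "Genji"),
    ("Default Paragraph Font", "."),
]
print(runs_to_tei(sample_runs))
```

Keeping the mapping in the style name itself means new tags can be introduced in Word without changing the conversion code.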

January 17, 2023 · 2 min · Nakamura

Trying to Add Images and a IIIF Manifest to IPFS

Overview Referencing the following tweet, I tried adding images and a IIIF manifest to IPFS. https://twitter.com/edsilv/status/1400221815369355267 For adding to IPFS, I used Fleek, which is also mentioned in the above tweet. https://fleek.co/ The following site was helpful for learning how to use Fleek. https://i-407.com/blog/m10/ Source Code The source code is below. https://github.com/nakamura196/fleek_test Steps Uploading Images First, I uploaded the following image to the above repository. https://github.com/nakamura196/fleek_test/blob/main/kunshujo_400.jpg Following the reference site, I connected this repository to Fleek. As a result, it became accessible at the following URL. ...

January 16, 2023 · 2 min · Nakamura

An Example Workflow for Creating TEI/XML from Excel

Overview

I created an example workflow for generating TEI/XML from data prepared in Excel. The following TEI/XML file is output. It supports page breaks using the pb tag, line IDs using the lb tag, multiple representations using choice/orig/reg tags, annotations using the note tag, and linking with IIIF images.

    <?xml version="1.0" encoding="utf-8"?>
    <TEI xmlns="http://www.tei-c.org/ns/1.0">
      <teiHeader>
        <fileDesc>
          <titleStmt>
            <title/>
          </titleStmt>
          <publicationStmt>
            <ab/>
          </publicationStmt>
          <sourceDesc>
            <ab/>
          </sourceDesc>
        </fileDesc>
      </teiHeader>
      <text>
        <body>
          <pb corresp="#page_22"/>
          <ab>
            <lb xml:id="page_22-b-1"/>
            <seg>
              いつれの御時にか女御更衣あまたさふらひ
              <choice>
                <orig>
                  給ける
                  <note corresp="#page_22-b-1-20" type="校異">
                    給けるーたまふ河
                  </note>
                </orig>
                <reg>
                  たまふ
                </reg>
              </choice>
              なかにいとやむことなきゝは
            </seg>
          </ab>
        </body>
      </text>
      <facsimile source="https://dl.ndl.go.jp/api/iiif/3437686/manifest.json">
        <surface source="https://dl.ndl.go.jp/api/iiif/3437686/canvas/22" xml:id="page_22">
          <label>
            [22]
          </label>
          <zone lrx="1126" lry="1319" ulx="1044" uly="895" xml:id="page_22-b-1-20"/>
        </surface>
        <surface source="https://dl.ndl.go.jp/api/iiif/3437686/canvas/23" xml:id="page_23">
          <label>
            [23]
          </label>
        </surface>
      </facsimile>
    </TEI>

An example of visualizing the above TEI/XML data is shown below. The image, text (original), text (regularization), and annotations are displayed on the same screen. ...

January 10, 2023 · 3 min · Nakamura

Created a Custom OpenSeaDragon Viewer for Use in TEI Viewers

Overview

I created a Custom OpenSeaDragon Viewer intended for use in TEI viewers.

Background

In developing a viewer that links TEI and IIIF as shown below, a viewer with the following capabilities was needed.

https://www.hi.u-tokyo.ac.jp/collection/digitalgallery/wakozukan/tei/

- Ability to load IIIF manifest files.
- Ability to track page navigation within the viewer component from outside the component.
- Ability to highlight partial regions of images.

Since I could not find an existing IIIF-compatible viewer that met all of the above requirements, I attempted to develop a custom viewer. I also tried publishing it as an npm package. ...

December 26, 2022 · 2 min · Nakamura

Script for Initial Setup of Omeka S on Amazon Lightsail (Adding the Easy Admin Module)

I created an updated version of the “Script for Initial Setup of Omeka S on Amazon Lightsail” introduced in the following article. This version adds “Easy Admin,” which makes it easy to add themes and modules, and also fixes permissions for related directories. I hope you find this helpful.

    # Variables
    OMEKA_PATH=/home/bitnami/htdocs/omeka-s
    ## Must not contain hyphens
    DBNAME=omeka_s
    VERSION=3.2.3

    #############

    set -e

    mkdir $OMEKA_PATH

    # Download Omeka
    wget https://github.com/omeka/omeka-s/releases/download/v$VERSION/omeka-s-$VERSION.zip
    unzip -q omeka-s-$VERSION.zip
    mv omeka-s/* $OMEKA_PATH

    # Move .htaccess
    mv omeka-s/.htaccess $OMEKA_PATH

    # Remove files and folders that are no longer needed
    rm -rf omeka-s
    rm omeka-s-$VERSION.zip

    # Remove the pre-existing index.html (if it exists)
    if [ -e $OMEKA_PATH/index.html ]; then
      rm $OMEKA_PATH/index.html
    fi

    # Create the database
    cat <<EOF > sql.cnf
    [client]
    user = root
    password = $(cat /home/bitnami/bitnami_application_password)
    host = localhost
    EOF

    mysql --defaults-extra-file=sql.cnf -e "create database $DBNAME";

    # Configure Omeka S
    # (absolute path to the password file, so the script works from any directory)
    cat <<EOF > $OMEKA_PATH/config/database.ini
    user = root
    password = $(cat /home/bitnami/bitnami_application_password)
    dbname = $DBNAME
    host = localhost
    EOF

    sudo chown -R daemon:daemon $OMEKA_PATH/files

    sudo apt install imagemagick -y

    # Module
    cd $OMEKA_PATH/modules

    ## easy admin
    wget https://github.com/Daniel-KM/Omeka-S-module-EasyAdmin/releases/download/3.3.7/EasyAdmin-3.3.7.zip
    unzip EasyAdmin-3.3.7.zip
    rm EasyAdmin-3.3.7.zip

    # Fix ownership so that the web server can write to these directories
    sudo chown -R daemon:daemon $OMEKA_PATH/files
    sudo chown -R daemon:daemon $OMEKA_PATH/modules
    sudo chown -R daemon:daemon $OMEKA_PATH/themes

December 24, 2022 · 1 min · Nakamura

Omeka S Module Development: FixCjkSearch - Fixing Full-Text Search Issues with Japanese in Omeka S

Full-text search in Japanese does not work properly with Omeka S’s standard functionality. The following article explains the details of the issue: https://nakamura196.hatenablog.com/entry/2022/03/07/083004 In the article above, I presented several workarounds, and I have now created a module that encompasses those solutions: https://github.com/nakamura196/Omeka-S-module-FixCjkSearch While this is not a fundamental fix, I hope it proves useful when working with Japanese materials in Omeka S.

December 23, 2022 · 1 min · Nakamura

[Omeka S Module Introduction] Folksonomy: Social Tagging

Overview Folksonomy is a module for implementing social tagging in Omeka S. https://omeka.org/s/modules/Folksonomy/ This article explains how to use this module. Installation It can be installed using the standard method, same as other common modules. Configuration After installation, the following settings screen is displayed. In particular, by enabling “Allow public to tag,” visitors can perform tagging. Additionally, by enabling “Require approbation for public tags,” an administrator approval workflow can be introduced. ...

December 20, 2022 · 1 min · Nakamura

Trying Out Gatsby CETEIcean

Overview I tried out Gatsby CETEIcean, created by Raffaele Viglianti. https://github.com/raffazizzi/gatsby-ceteicean-workshop Prototype Site The following is the prototype site. I have added several customizations, including MUI, vertical text display, and links to RDF data. https://nakamura196.github.io/gatsby-ceteicean-workshop/ The TEI/XML files from the “Koui Genji Monogatari Text DB” are used as the data source. https://kouigenjimonogatari.github.io/ Source Code The source code including the customizations can be found at the following link. https://github.com/nakamura196/gatsby-ceteicean-workshop Summary Using Gatsby CETEIcean, it seems possible to efficiently develop publishing environments for TEI/XML files. ...

December 20, 2022 · 1 min · Nakamura

Using the Japan Search SPARQL Endpoint with Yasgui

Overview Yasgui (Yet Another Sparql GUI) provides various advanced features for creating, sharing, and visualizing SPARQL queries and their results. https://github.com/TriplyDB/Yasgui This time, I attempt various visualizations using the Japan Search SPARQL endpoint with Yasgui. Results Table Display I visualize the number of items per dataset. First, here is a standard table display. Result Filtering and sorting of results is also possible. Chart Using the “Chart” tab, I attempt a chart display of the same results. ...

November 28, 2022 · 1 min · Nakamura

[Omeka S Module Introduction] Mapping Module

Overview This is an introduction to the “Mapping” module for integrating maps with Omeka S. https://omeka.org/s/modules/Mapping/ Installation This module can be installed using the standard method for Omeka S. Adding Location Information On the item editing screen, add location information from the “Mapping” tab. Map-based search and display are available on the public site.

November 25, 2022 · 1 min · Nakamura

[Omeka S Module Introduction] Timeline Module

Overview

This is an introduction to the “Timeline” module for creating timelines in Omeka S.

https://omeka.org/s/modules/Timeline/

Installation

You can install this module using the standard method for Omeka S. Below is an example of the installation method.

    cd omeka-s/modules
    wget https://github.com/Daniel-KM/Omeka-s-module-Timeline/releases/download/3.4.16.3/Timeline-3.4.16.3.zip
    unzip Timeline-3.4.16.3.zip

Usage

To use this module, you need to create a page on your site. In the following example, a page named “Timeline” has been created. Then, select Timeline from “Add new block” on the right side of the screen. By default, mapping to the timeline is performed targeting values stored in dcterms:date. ...

November 24, 2022 · 1 min · Nakamura