3 Using the Kanseki Repository

Electronic texts for researchers from researchers.

Overview

The texts of the Kanseki Repository are maintained in the @kanripo user account on the Web site GitHub.com. This is the raw material for all other uses of the texts, and the texts there are freely available to everybody. In this part, the software used to interact with this repository is discussed.

The term Kanseki Repository used thus far is mainly intended as a designation for the texts made accessible on GitHub. Technically minded readers can understand this as a 'back-end' of some sort; non-technical readers can imagine it as the stack room in a library where librarians will go to fetch the books ordered by the readers. Most users will encounter these texts not primarily as 'librarians', but as 'readers'; therefore the perspective presented here will be mostly that of a reader. However, academics are not only readers, but also writers. Books are not only used in public libraries, but also purchased and kept in a personal library. So the dividing line is not as fixed as it might seem.

Furthermore, readers have different ways to work with the texts in the Kanseki Repository. Again using a metaphor from the way books are used, the texts can be either read in a 'reading room', or borrowed from the library and taken back to a 'study'. Depending on the purpose and context, a reader can also switch back and forth between these uses, in a manner of speaking bringing their own books back to the reading room.

In this chapter, 'stack room' corresponds to the repository on GitHub, 'reading room' to the web application at kanripo.org and 'study' to the Mandoku application, which allows texts to be owned and worked on.

Kanripo.org

Thus, the reading room is the web application that can be visited by pointing a browser to the web address http://www.kanripo.org/. A home page similar to the one in Figure 1 will appear.

krp01-top.png

Figure 1: The start page of the Kanripo website, nicknamed the 'reading room'

From this page, there are three ways to access the texts that are stored in the stack room:

  • (1) Browsing the catalog
  • (2) Searching for titles
  • (3) Searching within the texts

Browsing the catalog

Catalog might be a slightly fancy expression for what is essentially a list of texts1.

krp02-catalog.png

Figure 2: The first page of the catalog

The top page of the catalog (Figure 2) shows a table of contents to the left and a content area (yellow background) to the right. Initially the top level headings of the catalog (部 bu, KR1 to KR6) and the second level headings (類 lei, KR1a to KR6v) are visible, currently comprising 85 items in total. Users can now browse the content by clicking on these serial numbers and immediately access the corresponding section of the catalog.

Alternatively, the top level headings are also available on the left side, providing convenient and quick access to the sections at all times. The main classification follows the structure of the Repository as outlined previously. Since most users, at least initially, will be more familiar with other collections, the main constituting collections of the Repository are also available in the lower left part. Clicking on a title will bring up a listing for this specific collection. The serial numbers are now those of the collection in question, effectively providing a view of a subset of the Repository containing only the texts that are part of the collection in view. These serial numbers are only for the orientation on the catalog page; all other pages will only display the serial numbers as used in the Kanseki Repository beginning with "KR".

As an example, browsing the catalog of the 道藏輯要 Daozang jiyao will bring up a screen similar to Figure 3. This gives the titles of texts, followed by the text’s dynasty of origin and the principal author (or other representative person, depending on the type of text), as can be seen for example in the entry for JY0032 "太上洞玄靈寶無量度人上品經法-南宋-陳春榮". The information here is given as a best estimate, mostly based on traditional cataloging. As work on the catalog proceeds, this will be updated with the latest information, and all the details, including the reasons for updates, will be available in the catalog. The different parts, "太上洞玄靈寶無量度人上品經法", "南宋", and "陳春榮" are separated by a hyphen ("-"). If the dynasty or name is not available for some reason, this will be skipped over and the hyphen character will stand by itself, as for example in the case of JY019, "太上中道妙法蓮花經- -".
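The entry format just described can be taken apart mechanically. The following is a minimal sketch in Python, not part of Kanripo itself: the function name parse_catalog_entry is hypothetical, and it assumes that the title contains no hyphen of its own, so that the last two hyphens always delimit dynasty and author.

```python
def parse_catalog_entry(entry):
    """Split a 'title-dynasty-author' catalog entry into its three parts.

    Assumption: the last two hyphens separate dynasty and author;
    a field that is empty or only whitespace is reported as None.
    """
    title, dynasty, author = entry.rsplit("-", 2)

    def clean(field):
        field = field.strip()
        return field if field else None

    return {"title": clean(title), "dynasty": clean(dynasty), "author": clean(author)}


if __name__ == "__main__":
    print(parse_catalog_entry("太上洞玄靈寶無量度人上品經法-南宋-陳春榮"))
    print(parse_catalog_entry("太上中道妙法蓮花經- -"))
```

An entry with fewer than two hyphens raises a ValueError in this sketch, which seems preferable to guessing which field is missing.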

krp02a-dzjy.png

Figure 3: Beginning of the listing for the 道藏輯要 Daozang jiyao

Clicking on the text, in this case JY019, will bring up the landing page of the text, as shown in Figure 4.

krp02b-landing.png

Figure 4: Landing page for KR5h0001 太上中道妙法蓮花經

This happens to be the text with the number KR5h0001, the first text of section KR5h 續道藏. This page gives an overview of the text, including available versions and a table of contents. If no detailed overview of the inner divisions of a text is available for longer texts, the 卷 juan are enumerated and listed here. In the Repository, a juan is the basic unit in which a text is stored and accessed. Any content before the first juan, such as prefaces, introductions, etc. is placed in the file with number ending in “0.” (For technical reasons, this file exists even if there is no such preliminary content.)
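A text number such as KR5h0001 thus encodes the top-level section (部 bu), the subsection (類 lei) and a serial number within the subsection. The sketch below shows one way to split it in Python. The function name is hypothetical, and the pattern (one digit from 1 to 6, one lowercase letter, four digits) is an assumption inferred from the examples in this chapter rather than a documented specification.

```python
import re

# Assumed shape of a text number: "KR" + section digit + subsection letter + 4-digit serial.
TEXT_NUMBER = re.compile(r"^KR([1-6])([a-z])(\d{4})$")


def split_text_number(text_number):
    """Return the section, subsection and serial number of a text number."""
    match = TEXT_NUMBER.match(text_number)
    if match is None:
        raise ValueError("not a recognised text number: %r" % text_number)
    section, subsection, serial = match.groups()
    return {
        "bu": "KR" + section,                # top-level section, e.g. KR5
        "lei": "KR" + section + subsection,  # subsection, e.g. KR5h
        "serial": int(serial),               # position within the subsection
    }


if __name__ == "__main__":
    print(split_text_number("KR5h0001"))
```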

To start reading the text, simply click on juan 1 in the list; this will bring up a screen similar to Figure 5. If no other version is specified, this will open the master version of the text. As previously explained, this is the most up-to-date and best curated version; it does not necessarily represent one specific documentary edition of the text, but rather the text as edited by the editors of the Kanseki Repository. Other versions can be accessed via the links at the top left of the page, on white background. There are five versions in this case: two versions of the Daozang and two editions of the DZJY (all four of these are "documentary"), plus one more edition of the DZJY featuring punctuation created as part of the DZJY project in Kyoto (this last edition is interpretative). At the bottom left of the page is a table of contents, in this case simply giving numbers for each of the 10 卷 juan of the text.

krp02c-kr5h0001-001.png

Figure 5: Beginning of the first juan of the text KR5h0001 太上中道妙法蓮花經

The area with the yellow background features the text to be read. The line breaks follow the selected edition, as do the page numbers, which are given following the pattern 001-001a, meaning juan 1, page 1, recto. The exact pattern depends on the edition, but is always in the order of "unit within the text" (juan in this case), page number and a letter indicating a subdivision of a page or something similar. Next to the page number is an icon, a small image of a woodblock printed page, which indicates that the digital text is accompanied by a scanned image of the corresponding physical page. Clicking on the page number will bring up a screen similar to Figure 6.
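As an illustration of the page reference pattern, here is a small Python sketch that splits a reference such as 001-001a into its three components. It covers only the pattern shown above; since the exact pattern depends on the edition, the regular expression and the function name parse_page_ref are assumptions for illustration.

```python
import re

# Assumed pattern: unit number, hyphen, page number, one letter for the page subdivision.
PAGE_REF = re.compile(r"^(\d+)-(\d+)([a-z])$")


def parse_page_ref(ref):
    """Split a reference like '001-001a' into (unit, page, subdivision)."""
    match = PAGE_REF.match(ref)
    if match is None:
        raise ValueError("unexpected page reference: %r" % ref)
    unit, page, part = match.groups()
    # The unit is a juan in the example; 'a' marks the recto of the page.
    return int(unit), int(page), part


if __name__ == "__main__":
    print(parse_page_ref("001-001a"))
```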

krp02d-kr5h0001-img1.png

Figure 6: Display of the text with facsimile of the first page of the 涵芬樓 Hanfen lou edition of KR5h0001

The right side of the screen here shows the digital facsimile of the requested page, while the left side shows the text as before. The size of the facsimile can be adjusted using the buttons above the image, + to enlarge the view and – to reduce it. The view can also be adjusted using the slider to the right of the – button. Clicking on ⟲ (reset) will return the view to its default size. On the right there is a drop-down menu showing the name of the currently displayed version, 【正統道藏・涵芬樓版】 in this case. Clicking the name of the edition allows the user to select a different version, if available (Figure 7). The buttons labeled "<" and ">" to the left and right of the drop-down menu can be used to display the previous and next facsimile page, respectively. Clicking on X will hide the digital facsimile, thereby restoring the screen to its previous state.

krp02d-kr5h0001-img2.png

Figure 7: Display of the text with facsimile of the original DZJY edition of KR5h0001

Search

Title search

There are two types of search in Kanripo — title searches and content searches. A title search searches the full strings of characters used to reference a text, including the dynasty and author. The result of a title search is one or more titles from the catalog. The text or texts can then be accessed as described above for browsing the catalog.

Search within texts

The search field on the right is used to enter a search term for a full-text search. At the moment only simple searches are supported, but more sophisticated methods will become available. An example of a full-text search result is shown in Figure 8.

krp-search-qijing.png

Figure 8: Full-text search for 七經, first page

The top part with the red background gives a summary of the results, specifically the total number of matches and the matches shown on the current page—“1 to 20” in this case. The main part of the screen, with yellow background, shows the details of these first 20 results, using a so-called "keyword in context" (KWIC) format. The text number, title (sometimes truncated), and location within the text are displayed for each match. This location functions as a hyperlink, such that clicking it will open the text at the indicated location with the search term highlighted. The third item is an excerpt of the text at the location of the match, including a fixed number of characters before the search string (three in this example) and a number of characters after the match. The total length of this line is fixed; in this case there are 10 characters including the search term. The line displayed here very closely resembles the index that is used internally to locate the matches. This is useful to know, because it means that searches for strings of more than 10 characters will return no results. In practice, it is recommended that search terms be 2 to 5 or 6 characters in length. Searches for a single character will be rejected by the system, because they tend to overload the server.
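The display format can be made concrete with a short Python sketch. It only imitates the KWIC lines described above (three characters of preceding context, ten characters in total) and the two length restrictions; it is not the index code used by Kanripo, and the function name kwic_lines and its parameters are hypothetical.

```python
def kwic_lines(text, term, before=3, width=10):
    """Return one fixed-width context line for every occurrence of term.

    Mirrors the restrictions described above: single-character searches
    are rejected, and terms longer than the line width find nothing.
    """
    if len(term) < 2:
        raise ValueError("single-character searches are rejected")
    if len(term) > width:
        return []
    lines = []
    pos = text.find(term)
    while pos != -1:
        start = max(0, pos - before)           # fewer context characters at the very start
        lines.append(text[start:start + width])
        pos = text.find(term, pos + 1)
    return lines


if __name__ == "__main__":
    sample = "abcdefghijklmnopqrstuvwxyz"
    print(kwic_lines(sample, "fg"))
```

Because each line has a fixed width, a longer search term leaves fewer characters of following context, which is one practical reason for keeping search terms short.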

The search function has been specifically designed for the Kanseki Repository. While building the index that is internally used for executing the search, all editions of a text are taken into account. Those index entries that are not from the master branch are marked in the index display with 異本 yiben 'Different version'. In addition to that, the portions of a text that are printed as notes in smaller characters are considered as a separate sequence of characters, outside of the sequence of the main text. Matches that are drawn from the text in notes are marked as 夾註 jiazhu, 'interlinear notes' in the index display.

Search is central to research, because it is used to locate new material, but it also serves as a tool for analysis to better understand the cultural tradition as represented by the collection. There are two main ways that searches can be refined to present the material in a desirable way: (1) changing the sort order, i.e., the specific way that results are presented to the user; and (2) applying filters to display only the subset of matches that is of interest for the purpose at hand.

Changes to the sort order

The top left of the results page with the white background has links that allow the sort order to be changed to any of the following:

  • "By text number": This sorts the search results by the text number. Since the numbering follows the traditional classification, this effectively places together texts that have been considered to be closely related.
  • "By date": The date here is to be understood as the date of creation of the major part of the text. Dating texts is a very complex undertaking and in many cases there are conflicting views. The solution used here is to arrange all texts into a list. The order of the texts in this list will be used to determine the sorting3.
  • "By search term": This is the sort order that is used by default if no other sort order is specified. It simply arranges the results according to the characters following the highlighted search term. This should ideally be the lexical order using the radical/stroke count-based ordering of the 康熙字典 Kangxi zidian. However, for technical reasons this is only approximately achieved, since the underlying character codes on which the sorting is based have been defined in several blocks, not continuously. In addition, some variant characters that are represented by images because they are not available in the character code set will sort before the other characters in that position.
  • "By preceding characters": In this case the order (subject to the same limitations as in the previous case) is based on the characters immediately preceding the search term, taking the characters into account in reverse order of reading, starting from the search term.
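The difference between the last two sort orders can be sketched in a few lines of Python. This is an illustration only: the dictionary keys "before" and "after" are hypothetical names for the context on either side of the search term, and Python's built-in string comparison by character code stands in for the ordering used on the site.

```python
def sort_by_following(matches):
    """Order matches by the characters that follow the search term."""
    return sorted(matches, key=lambda m: m["after"])


def sort_by_preceding(matches):
    """Order matches by the characters before the search term.

    The preceding context is reversed, so the character directly
    adjacent to the search term is compared first.
    """
    return sorted(matches, key=lambda m: m["before"][::-1])


if __name__ == "__main__":
    hits = [
        {"id": 1, "before": "xya", "after": "zz"},
        {"id": 2, "before": "aab", "after": "aa"},
    ]
    print([m["id"] for m in sort_by_following(hits)])
    print([m["id"] for m in sort_by_preceding(hits)])
```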

The other links further down the left side of the screen, grouped into several sections, all serve as filters to drill down to specific parts of the result set:

  • 朝代 Chaodai ('Dynasty') breaks down the results by dynasty and lists the most frequent matches. The numbers in parentheses are the number of hits for the specific dynasty.
  • 部 Bu ('Section') breaks down the results according to the sixfold classification, thus providing a quick overview of where the most matches are found.
  • 部/類 Bu lei ('Subsection') breaks down the results by subsection. The six most frequent matches are displayed. The top-level section names are included for reference to enable easier orientation.
  • 文獻 Wenxian ('Text') gives the top six texts, i.e., the texts with the most matches, along with the number of matches.
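Each of these filters amounts to counting the matches by one attribute and showing the most frequent values. A minimal Python sketch of that idea follows; the function name facet and the field names are hypothetical, and the default of six entries is taken from the description of the subsection and text filters above.

```python
from collections import Counter


def facet(matches, field, top=6):
    """Count matches by one attribute and return the most frequent values."""
    counts = Counter(match[field] for match in matches)
    return counts.most_common(top)


if __name__ == "__main__":
    hits = [
        {"dynasty": "唐", "text": "KR5h0001"},
        {"dynasty": "宋", "text": "KR5h0001"},
        {"dynasty": "唐", "text": "KR1a0001"},
    ]
    print(facet(hits, "dynasty"))
    print(facet(hits, "text"))
```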

List of texts with matches

In addition to this KWIC display, the search results can also be viewed in a completely different way—according to text. This view can be activated by clicking "Show results by text" at the right of the area with red background. The results are then displayed in order from the texts with the highest number of matches, but the order can be changed using the links on the left.

Preparation for more advanced use

What has been explained so far is the kind of interaction with the material that is possible for any visitor to the kanripo.org website without any preparation.

For more advanced usages, many web sites offer the possibility to sign up for an account that allows personal settings and data to be preserved between sessions. The kanripo.org web site does not maintain accounts by itself, but instead asks users to log in using an account with the website github.com. No personal information whatsoever is permanently stored at kanripo.org, although some transient data is cached while the site is used. Using GitHub for this purpose has two main advantages: (1) It allows users to transparently take ownership and control of their data independently of the web site; and (2) Since the repository itself is hosted on GitHub, using a GitHub account for this purpose is the most logical way to enable settings and data to be shared with other sites where relevant. (Note that multiple web sites and services can provide access to the Kanseki Repository.)

GitHub is a commercial company based in the U.S. There is a Japanese branch as well, but so far the web site is available in English only. Accounts are free of charge if the data placed into the accounts is available for public access without limitations. It is also possible to have private repositories, but these require an account with a subscription fee.

Creating an account is quite straightforward; simply visit https://github.com/ and follow the instructions. For use with the Kanseki Repository a free account will do.

krp-github-authorize-application.png

Figure 9: Request for authorization from GitHub

Once an account is ready, it can be accessed from kanripo.org using the "Login" link on the red background at the top right of every page of the web site. If you are already logged in to GitHub when you click this link on Kanripo for the first time, GitHub will require you to authorize access from Kanripo. This is necessary because Kanripo will interact with GitHub on your behalf. This authorization is carried out using the OAuth2 protocol, which is currently thought to be the most secure way of enabling interaction between web sites. As you can also see on this page, the permission you have to give to Kanripo is very specific: it allows access only to public and private repositories (see Figure 9). No other information associated with your account will be accessible to Kanripo. It is also important to know that, if necessary, this authorization can be revoked at any time from your GitHub account. After you click on the green "Authorize application" button on the GitHub page, you will be returned to the kanripo.org website. At the same time, Kanripo will receive a token from GitHub for authenticating further interactions between the two websites.

One of the first things Kanripo will do with this token is to create a new repository under your GitHub account with the name KR-Workspace. Kanripo will use this repository to store the data you create while working on Kanripo (See Figure 10). Having this data under your own GitHub account will allow you to access the data, either directly yourself or via a software application, provided there is appropriate authentication.

krp-krptest-workspace.png

Figure 10: The workspace of user krptest

Once you are logged in, the “Login” button changes to display your GitHub user name. Clicking this button now takes you to a profile page that summarizes your account information. (Although it appears on kanripo.org, as shown here, the data actually resides in your workspace on GitHub.)

Advanced uses of the Kanseki Repository

Interaction with GitHub enables information about users and their work to be saved across visits and it also allows customization of some settings. Furthermore, texts of special interest can be copied to a user’s account ("to fork a repository" in the language of GitHub) and edited there. Kanripo.org is smart enough to use a text from the user in preference to the text in the common @kanripo account if available. A user can, for example, also correct misprints and report them to the editors of the Kanseki Repository using this method. How this is done is explained further below.

Web browsers have the advantage of being readily available without the need to install additional software and they are familiar to everyone, but they also have limitations. It becomes tedious to edit or even read a text in the browser for long periods of time. For this purpose, a specialized piece of software called Mandoku has been developed, which will also be introduced below.

Kanripo.org

Here we introduce some of the features of the Kanripo.org web site that become available to users who are logged in. The first allows users to easily collect a list of texts of special interest; the second shows how to customize the "sort by date" function mentioned above.

Lists of texts

In many cases, the focus of a specific research question lies in a number of specific texts that require highly refined or specialized searching. When performing searches that generate lists of texts, the result screen will display an additional line at the top, as shown in Figure 11.

krp-save-result-list.png

Figure 11: Saving a text list now becomes possible

As can be seen, the current search term is already suggested as a filename for the list of texts that result from this search, but of course it can be changed to any other name. All texts in the list have a checkbox which is unselected by default, but clicking the line "Toggle selection" at the left will select all the texts. The "Save text list" button saves the list of selected texts to the KR-Workspace on GitHub. (In GitHub terminology, the list is "added" and "committed"). To use the list as a filter for searches on Kanripo.org, however, the list has to be loaded. This can be done by selecting "Immediately load the text list". If the list is not immediately loaded, or on a subsequent visit to the site, the text list can also be loaded from your profile, as seen in Figure 10. Here, we see a number of available text lists, one of which is currently loaded. Figure 12 shows the results of a filtered search.

krp-filter-yaunshi.png

Figure 12: Searched for 全唐, but showing only texts that are in the list called 元始
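The effect of a loaded text list can be pictured as a simple membership test applied to every match. The Python sketch below assumes, purely for illustration, that a text list is a collection of text numbers and that each match records the number of its text under the hypothetical key "text"; the actual file format in KR-Workspace is not described here.

```python
def filter_by_text_list(matches, text_list):
    """Keep only the matches whose text number is in the loaded text list."""
    allowed = set(text_list)  # set lookup keeps the filter fast for long lists
    return [match for match in matches if match["text"] in allowed]


if __name__ == "__main__":
    hits = [
        {"text": "KR5h0001", "line": "..."},
        {"text": "KR1a0001", "line": "..."},
    ]
    print(filter_by_text_list(hits, ["KR5h0001"]))
```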

Changing the list for "ordering by date"

As already explained, the “order by date” feature effectively sorts texts according to a list of texts arranged in a specific, predetermined order. This list can be inspected by selecting "Order by date" on the profile page. This will open the file for editing on the GitHub site, where the sorting order can be changed as necessary. Don't forget to save the changes by clicking on the green "Commit changes" button at the very bottom of this page. From now on, the search results of this user will be sorted in this order, according to the changed file.
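Sorting by a list of this kind means ranking every match by the position of its text in the list. A minimal Python sketch, under the assumption that the list is a sequence of text numbers; the function name is hypothetical, and placing texts that are missing from the list at the end is a choice made for this example, not documented behaviour.

```python
def sort_by_date_list(matches, ordered_texts):
    """Sort matches by the position of their text in the ordered list."""
    rank = {text: position for position, text in enumerate(ordered_texts)}
    # Texts not found in the list all receive the same, largest rank.
    return sorted(matches, key=lambda match: rank.get(match["text"], len(rank)))


if __name__ == "__main__":
    order = ["KR1a0001", "KR5h0001", "KR3a0002"]
    hits = [{"text": "KR3a0002"}, {"text": "KR6q0003"}, {"text": "KR1a0001"}]
    print([m["text"] for m in sort_by_date_list(hits, order)])
```

Moving a line in the list therefore changes the position of all matches from that text, without any date being stored explicitly.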

Forking a text

As mentioned above, it is possible to "check out" texts of interest from the Kanseki Repository and copy them to a private "library." This action is known as "forking" the text, in the sense of a fork in a roadway from which roads lead in two separate directions. By "forking" a text, a user can edit the text independently of the text as it still exists in the Kanseki Repository. This is conceptually different from creating a branch, which adds a new version to a text. Forking makes a complete copy of all branches; it takes a snapshot of the state of all branches and records them together with the location of their origin. It is important to note that forking is a specific action of the GitHub system. After forking, the new text will be available for editing in the user’s account on GitHub. Texts displayed in the kanripo.org website have a link labeled "GitHub". This link leads directly to the corresponding page on GitHub where the text can be forked. If the text has been forked already, the link will put the page in editing mode instead. Thus, this link can be used when a user wishes to make a correction of the text, for example.

There is also a different way to copy a text, referred to as "cloning." In this case, the text is copied to the local machine from which cloning is initiated. Both the text in the @kanripo account on GitHub and the "fork" of a text in the user account can be cloned. In the latter case, users can also "push" updates made locally back to the version of the text in their online accounts. Note that a text in the @kanripo account cannot be updated directly from a user account, since pushing is not allowed. Users wishing to push changes to the Kanseki Repository will have to fork the text and then make use of a different mechanism, called a "pull request", which will be explained later. A necessary condition for a pull request is the existence of a fork, so users who want to make a pull request will need to fork the text first.
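The distinction between cloning from @kanripo and cloning one's own fork can be sketched as follows. This Python fragment only builds the git command line and does not run it. It assumes that each text is a GitHub repository named after its text number (for example KR5h0001 under the account kanripo); the function names and the user name "krptest" in the example are used for illustration only.

```python
def repo_url(text_number, account="kanripo"):
    """Return the HTTPS address of a text repository in a GitHub account."""
    # Assumption: one repository per text, named after the text number.
    return "https://github.com/{}/{}.git".format(account, text_number)


def clone_command(text_number, account="kanripo", dest=None):
    """Build the argument list for 'git clone' without executing it."""
    command = ["git", "clone", repo_url(text_number, account)]
    if dest is not None:
        command.append(dest)
    return command


def can_push(account, user):
    """Pushing works only for a fork in the user's own account."""
    return account == user


if __name__ == "__main__":
    print(clone_command("KR5h0001"))             # read-only copy from @kanripo
    print(clone_command("KR5h0001", "krptest"))  # copy of the user's fork
    print(can_push("kanripo", "krptest"), can_push("krptest", "krptest"))
```

The resulting list could be handed to subprocess.run to perform the actual clone.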

Mandoku

For intensive work with texts, such as close reading or translation, we recommend the use of Mandoku. This application allows users to download texts of interest and also edit them offline (while not connected to the Internet).

Mandoku is based on Emacs, a very sophisticated text editor that has been in active development for over 30 years. Some of its concepts, and the terminology used to describe them, predate current software and might seem unfamiliar. It will take some time to get used to them, but that time is well invested.

Here we describe how to install Mandoku on Windows, Mac OS X and Linux and then give an overview of some of the program’s functions.

Requirements:

  • On all platforms, git and python are required for the installation process and for software updates; Emacs is of course also needed to run the program.
  • A working network connection.
  • About 400 MB of free space on the hard disk used for installation4.

Installation5

Windows

There are three ways to install Mandoku on computers running the MS Windows operating system:

  • (1) Using a "bundle" that contains all necessary files and programs.
  • (2) Installing all components separately and adding the Mandoku package for Emacs manually.
  • (3) Installing the gnupack distribution and then proceeding with the relevant instructions for Linux6.

For users as yet unfamiliar with Emacs, the first method is recommended. Users who already have a relatively recent version of Emacs (24.4 or newer is recommended) are best served by the second method. Method (3) provides an environment that is well suited to working with Japanese and Chinese, but the exchange of files with other Windows programs requires some rather advanced technical knowledge, so it is not recommended for users who are not already familiar with this kind of distribution. In what follows, I will mainly discuss the first method.

A distribution "bundle" with all necessary files to start using Mandoku is available from the Mandoku website.

Installing the installation bundle for Windows

The bundle for Windows has been tested on Windows 8 and 10. The bundle may also work on earlier versions of Windows, but it has not been tested on them. The Emacs application included in the Mandoku bundle below is version 24.5 from ntemacs. This version is known to work with input methods for Chinese and Japanese.

krp-win-install1.png

Figure 13: Dragging the folder krp to the root of drive C:

  1. The Mandoku bundle includes a version of the git program, which is used to download and update necessary files.
  2. Download the Mandoku installation bundle mandoku-2016-03.zip (200 MB).
  3. Double-click the archive file then move the resulting krp folder to a root directory of your computer, e.g., C: or D:, as shown in Figure 13. The folder can also be moved to a USB memory stick or external hard disk. Do not copy it to your user folder because it may not work correctly there.
  4. Inside the folder krp, navigate to the bin folder then click on the file "start-mandoku", which is a batch file that configures the operating environment and launches Emacs. This file can also be accessed via a shortcut placed on the Desktop, so you can easily find it again next time. To create such a shortcut, right-click on the file "start-mandoku" and select "Send-to", then "Desktop (create shortcut)". You will then be able to use that shortcut to start Emacs (and Mandoku) next time. Alternatively, you can also pin the shortcut to the taskbar for convenient access.

Installing Mandoku on a Macintosh computer

There is no need for an installation bundle for computers running Mac OS X 10.9 or newer, since everything except Emacs is already available in the standard installation of the operating system. To install Mandoku, use version 24.4 or later of Emacs. (The version containing the patch by Yamamoto Mitsuharu 山本光晴 of Chiba University, available at macemacs, is recommended, because it supports the whole range of Unicode.) Simply download a zip archive from this link, click on the downloaded file to unpack it, then drag the file Emacs to your Applications folder.

Once you have Emacs installed, you need to install and activate the Mandoku package. You can do this in different ways; two of these are outlined here. Use whichever method seems easier to you.

  • Activating Mandoku: Method 1

    One simple way of activating Mandoku is to download the file activate-mandoku.el, open it in Emacs and then execute it by opening the menu item "Emacs Lisp" and clicking on "Evaluate Buffer." Emacs will now start to install Mandoku and all other packages that are required by it.

  • Activating Mandoku: Method 2

    Open Emacs and find the "scratch" buffer. This buffer might not be listed on the Buffer menu, but you can always find it in the Buffer list, which is available on the Buffer menu under "List all buffers". Once in the scratch buffer, copy the following code and paste it there:

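    ;; Register the MELPA archive, then install Mandoku and its dependencies.
    (progn (require 'package)
     (add-to-list 'package-archives
                  '("melpa" . "http://melpa.org/packages/"))
     ;; Load the package system before fetching the current package lists.
     (package-initialize)
     (package-refresh-contents)
     (package-install 'mandoku))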

    Next, move the cursor to the very end of the buffer, after the last closing parenthesis, ")", and finally press "Control-x" followed by "Control-e." This should get things going: Emacs will install Mandoku and other packages that are required.

Installing Mandoku on a Linux computer

To use Mandoku, you will need to install Emacs version 24.4 or later. How this is done depends on your system. Usually there will be a package manager that can be used for this task7.

Once Emacs is installed in a sufficiently recent version, simply proceed with adding the Mandoku package as detailed above for Macintosh-based computers using either Method 1 or Method 2.

Starting Mandoku for the first time on Macintosh or Linux

With the bundle for Windows, Emacs is set up to automatically start Mandoku when the program starts. On other systems, however, Mandoku has to be launched by the user. To start using Mandoku for the first time, you will have to manually ask it to show the catalog. This is done by typing "M-x." (On Mac OS X, “M” here usually means the “Command” key, but in some configurations it could also be the key labelled "Option" or "alt". Just try to press each of these keys at the same time as "x" until you see "M-x" appear at the bottom of the Emacs application window). Once you release the keys, you will see a prompt at the bottom of the Emacs window; this is called the minibuffer. At the prompt, now type "mandoku-show-catalog", followed by the Enter key. This will initialize the mandoku package. This process will also ensure that the catalog will load right away the next time, so that you will not need to go through this process again.

Mandoku will now ask you where you want to place the files related to use of the Kanseki Repository. This will mostly be texts, but also other data files, data you produce while working with texts, etc. The program will suggest "~/krp" as the default location, i.e., the folder "krp" in your home directory. You can either accept this by pressing Enter, or specify a different path. From this point onwards the process is the same for all systems, as described in the next section.

Connecting to GitHub and installing the workspace

For some time Emacs will go on downloading and installing more packages that are needed for operation. Some messages that report on the progress of these activities will appear on the screen. Eventually, when download and installation is complete, Emacs will connect to GitHub and download your workspace from there (if it exists).

Emacs will now ask you for your GitHub account name and password. This will be used to ask the GitHub site to create an authentication "token" for you. This token is stored and used for subsequent access from the particular computer you are using. Note that you will need to enter the password only once and it will not be stored; only the token will be stored. The startup process might take a while, but this is only necessary for the first time. If you do not want to log in to GitHub now, you can skip this step and start using Mandoku right away. Using your GitHub credentials will be necessary later when you want to download texts from the Kanseki Repository for local use.

Layout of the krp folder on the computer

Mandoku will store all of the files that are created, edited or downloaded during its operation in a folder hierarchy that usually has the name krp. On Windows, this is generally placed in the root directory, while on other systems it is usually in the user's home directory. In this section, the layout and meaning of this folder will be described.

Table 1: The folders in krp

name          edit  description
KR-Gaiji      part  List of non-system characters (gaiji) and their images
KR-Workspace  yes   Workspace shared with the website
images        no    Facsimile images
index         no    Index for local text files
meta          part  Catalog files
system        no    Files used by Mandoku itself
temp          yes   Temporary files; can be deleted occasionally
text          yes   Texts downloaded (copied) from the Kanseki Repository for local use
work          yes   Additional user files not from the Kanseki Repository
bin           no    Windows only: Emacs, git, python and other programs

Table 1 shows a typical list of the folders present in the krp folder, together with a short description. Their status (whether and to what extent they are editable) is indicated in the second column, "edit". Some of the folders are used only by the system, so users should not edit their contents directly, to prevent problems arising from accidentally deleting or corrupting a file. For these folders, edit is set to "no". Other folders contain both files used by the system and files that a user can edit if necessary; for these, edit is set to "part". The meta and KR-Gaiji folders are examples of this.

Users will usually edit some or all of the files in the KR-Workspace, text, and work folders. KR-Workspace is a copy of the folder of the same name on GitHub, and usually these two folders will be kept synchronized.8 The text folder is where the text files downloaded from the Kanseki Repository are stored; the texts in this folder can be edited by users as needed. The work folder, on the other hand, is where users can place their own files to make them available to the system. This is an advanced usage that is described in the online documentation.
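
The folder layout can be captured in a few lines of code. The following is only a minimal sketch that transcribes the "edit" column of Table 1; the function names krp_root, freely_editable and folders_with_status are hypothetical helpers introduced for illustration and are not part of Mandoku.

```python
from pathlib import Path

# Edit status of each folder, transcribed from Table 1.
KRP_FOLDERS = {
    "KR-Gaiji": "part",
    "KR-Workspace": "yes",
    "images": "no",
    "index": "no",
    "meta": "part",
    "system": "no",
    "temp": "yes",
    "text": "yes",
    "work": "yes",
    "bin": "no",  # present on Windows only
}


def krp_root(location="~/krp"):
    """Expand the krp location; "~/krp" is the default suggested by Mandoku."""
    return Path(location).expanduser()


def freely_editable(folder):
    """Return True only for folders whose edit status in Table 1 is "yes"."""
    return KRP_FOLDERS.get(folder) == "yes"


def folders_with_status(status):
    """Return the sorted names of all folders with the given edit status."""
    return sorted(name for name, s in KRP_FOLDERS.items() if s == status)
```

For example, freely_editable("text") is true, while freely_editable("index") is false, which mirrors the advice not to touch system folders directly.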

Using Mandoku

Mandoku is a package that extends the functionality of the editor Emacs. This means that all of the core functions of Emacs, as well as any other extensions users have installed, are available while using Mandoku. Altogether, this puts a very powerful system at your service. Using Mandoku might seem daunting at first, and it will take time to get used to the way things are done in Emacs, but it is definitely worth the effort. This is not the place for a general introduction to Emacs, however, so the discussion below is limited to introducing some of the main functions available in Mandoku.

krp-mandoku-read.png

Figure 14: A text opened for browsing in a temporary location

Browsing

As on the web site, Mandoku allows the user to browse the contents of the Repository. This can be done easily from the catalog9 by clicking on the link "Kanseki Repository 漢籍リポジトリ" or by positioning the cursor on any of the underlined, blue characters making up this phrase and then pressing the "Enter" key. This will display the familiar list of six main divisions in the catalog and by activating any of the links the desired subsection of the catalog can be viewed. A list of texts will be displayed, from which texts can be accessed for reading, as shown in Figure 14.

It is important to note here that the text has been downloaded from the server to a temporary location to allow browsing of the text. If the text is going to be read closely or worked on, it should be downloaded and made available in the private local library of the user. This is the meaning of the second line of the text shown in Figure 14: "# Don't edit this file. If you want to edit, press Control-c d to download it first."

To download the file, proceed as follows. Press the "Control" and "c" keys together, release both and then press "d". This will initiate the download process. At this point, your GitHub credentials (username and password) are needed. If you have entered them previously, you should not need to enter anything here, because a token to authenticate access to your GitHub account will already have been generated.

github-fork-remote.png

Figure 15: Situation after cloning, forking and adding remote of text KR5h0001

Once the download is completed, Mandoku will prompt you with the question "Fork repository and add remote?" As explained above, "fork" is a GitHub term for a copy ("clone") of a text in the user's own GitHub account. If you answer "yes" to this question, a fork will be created and a reference to this forked text will be added to the text that was cloned. Such a reference is called a "remote". Answering "yes" results in the situation illustrated in Figure 15. Altogether there are now three distinct instances of the text. The first is owned by the user @kanripo and the second by the current user (@cwittern in this case). Both of these copies are located on GitHub, i.e., in the "cloud". The third copy is on the computer of the current user, cwittern.

The location from which a text has been cloned is usually recorded as a remote reference named "origin". Answering "yes" to the above question generated a fork, resulting in another remote, this time under the name "cwittern". That is, the name of the GitHub user serves as the name of the remote in this case. If the answer to the question had been "no", no fork would have been created and no additional remote added. The text would then not exist in @cwittern's user account on GitHub, and there would be just two copies of the text.
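
The relationship between the copies and the remotes can be modelled in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the "account/text" notation for a repository location is an assumption made for the example, not something Mandoku prescribes.

```python
def remotes_after_download(text_id, user, fork):
    """Return a mapping of remote name -> repository location.

    Assumption: a repository is written here as "<account>/<text id>".
    """
    # The clone always records where it came from under the name "origin".
    remotes = {"origin": f"kanripo/{text_id}"}
    if fork:
        # Answering "yes" adds the user's fork, named after the GitHub user.
        remotes[user] = f"{user}/{text_id}"
    return remotes


def count_copies(remotes):
    """One copy per remote on GitHub, plus the clone on the local computer."""
    return len(remotes) + 1
```

With fork set to True for the text KR5h0001 and the user cwittern, the mapping contains the remotes "origin" and "cwittern" and count_copies gives three; with fork set to False only "origin" remains and there are two copies.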

krp-magit-show-refs.png

Figure 16: Text (upper half) and references in Magit branch manager (lower half) for text KR5h0001

The situation can also be confirmed from within Mandoku. The Emacs package Magit, which is installed together with Mandoku, is used to interact with git. It is configured to be activated by pressing "Control-x" followed by "g". (This displays an overview screen.) Pressing "y" now displays the branch manager10. If Magit is called while viewing a newly downloaded text, the display appears similar to Figure 16. As can be seen, all available branches for the remotes are displayed. The cursor keys can be used to move to any of the branches to view it. This retrieves and displays the text of this branch.

Searching for titles

A special function key, "F7", has been set up for searching titles. After pressing this key, a prompt will be displayed at the bottom of the screen: "Mandoku | Search for title containing: ". Simply enter some characters and press Enter. All titles containing the entered characters will be searched for. The list of titles displayed is the same as on the web site, and the names of dynasties and authors are similarly included.

From the list of titles displayed in the search results, the desired one can be selected by moving the cursor to the line and pressing Enter. Alternatively, the mouse can be used to click on a title.

krp-mandoku-taiping.png

Figure 17: The Mandoku Index for 太平經

Searching within the texts

Pressing "F6" will initiate a full-text search for a specified character string. This command is intended to be used when reading a text and trying to find other occurrences of a term in the text that is displayed. It will take the six characters starting from the cursor position and display them after the prompt "Search for: ". This search string can then be edited as needed. The search is initiated by pressing Enter.
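
How the default search string is picked up can be illustrated with a short sketch. The function name default_search_string is hypothetical, and the sketch simply slices a string; it does not account for line breaks or markup, which the actual command may well treat differently.

```python
def default_search_string(text, cursor, length=6):
    """Return up to `length` characters of `text`, starting at `cursor`."""
    # Slicing never raises near the end of the text; it just returns
    # fewer characters than requested.
    return text[cursor:cursor + length]


# An arbitrary sample string (the numerals one to ten), not a quotation.
sample = "一二三四五六七八九十"
print(default_search_string(sample, 2))
```

Near the end of a text fewer than six characters are available, so the proposed string is simply shorter; in either case it can still be edited before the search is started.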

The resulting display, in this example for the term 太平經, is similar to the one on the web site as shown in Figure 17. In this case there are 406 matches. The location is a link that can be followed to display the text. Emacs will show up to 2000 hits in the default setting.

A very short crash course on git: commit, push and pull

The git software is used to keep track of branches and versions of the texts on your computer. It also works behind the scenes on GitHub. It is important to follow some simple rules when using it.

Commit

Once a file has been edited and saved to the computer, git needs to be told about this change. This action is called "committing" the file. As a result, the current state of the file is recorded in git's internal database, from where it can be retrieved later. (Selecting the changes that go into a commit is what git calls "staging".) It is now possible to change to another branch as explained above. In Magit (activated by pressing "Control-x" then "g" from any text downloaded from the repository), the commit action is initiated by pressing "c". A pop-up window with instructions on how to proceed will then be displayed.

Push

After a change has been committed, the changed file is still available in its current state only on the local computer. Any connected folder (a "remote", as git calls it) will need to be informed about the changes, so that they are available there as well. This action is referred to as a "push" to the remote. In Magit this is done by pressing "P" (uppercase) and then following the instructions. You will need to select the remote and the branch, but usually the defaults will be fine. As a result, the change will appear on the remote repository as well, for example in the connected folder in the GitHub user account.

Pull

Pull is the inverse of push. It checks the connected remote folder to see whether there are any changes that need to be incorporated into the branch that is currently on display. In Magit, this is initiated using the "F" key.
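
For readers who want to see what these three actions correspond to on the git command line, the following sketch builds the respective commands. The helper names are hypothetical, and the remote "origin" and branch "master" are assumed defaults that may differ in a given setup; Magit itself offers more options than shown here.

```python
import subprocess


def commit_command(message):
    # "-a" includes every tracked file that has been modified.
    return ["git", "commit", "-a", "-m", message]


def push_command(remote="origin", branch="master"):
    # Send local commits on `branch` to `remote`.
    return ["git", "push", remote, branch]


def pull_command(remote="origin", branch="master"):
    # Fetch changes on `branch` from `remote` and merge them locally.
    return ["git", "pull", remote, branch]


def run(command, repo_dir):
    """Run one of the commands above inside the folder of a text."""
    return subprocess.run(command, cwd=repo_dir, check=True)
```

A call such as run(commit_command("Corrected a character"), folder) would then record the change in the text stored in that folder; to push to a fork instead, the remote name would be the GitHub user name, as described above.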

Too complicated? Automate all of this

If this sounds a bit complicated and daunting, you can relax. Mandoku has a setting that can automate all this for you. You can enable or disable this setting on the Mandoku menu. It works by periodically committing all relevant files, pushing them to the remote if necessary, and pulling changes from the remote. If no Internet connection is available, this will be done later once a connection is available again. This function will also automatically get all your texts from your GitHub user account if they are not already downloaded. While this will work most of the time, there could be situations when something goes wrong. In this event, it is good to know at least a little about what is going on and what could have gone wrong. If in doubt, you can go to the mailing list for Mandoku users11 and ask a question there.
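
The logic of such an automatic cycle can be sketched as follows. This is not Mandoku's actual implementation, only an illustration of the behaviour described above under the assumption that a push which cannot be made offline is remembered and carried out in a later cycle; the function name sync_cycle is hypothetical.

```python
def sync_cycle(has_local_changes, online, pending_push=False):
    """Decide what one automatic cycle does.

    Returns a pair: the list of actions taken, and whether a push is
    still outstanding for the next cycle.
    """
    actions = []
    if has_local_changes:
        # Committing is a purely local action and works without a network.
        actions.append("commit")
        pending_push = True
    if online:
        if pending_push:
            actions.append("push")
            pending_push = False
        actions.append("pull")
    return actions, pending_push
```

Without a connection, a cycle with edited files only commits and leaves the push pending; the next cycle with a connection pushes and then pulls.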

Footnotes:

1
In fact, there is also a proper catalog with detailed information about each text. It is currently available only in the “stack room” at kr-catalog, but in the future it will become more integrated into the “reading room.”
2
Where available, the serial numbers of the project in question have been used. If no such number is available, such as in the case of the DZJY, other widely used numbers have been added; in this case the serial number used by the Daozang Jiyao Project.
3
Due to the importance of this for some research questions, it has been made possible to adapt this list for specific needs. This is one of the more advanced usages which is explained later.
4
In some cases, a portable version can also be used on removable media, such as USB memory drives.
5
These installation instructions are also available online at mandoku-install. Since the instructions there will be more up-to-date, with links to required items, it is advisable to visit that page when installing the program.
6
This installs cygwin and additional supporting software, including Emacs, in a way that essentially provides an operating environment similar to Linux and Mac OS X. gnupack is available from gnupack. Be sure to use the "development" version.
7
Many systems based on Debian, such as all Ubuntu systems and all Linux Mint versions, come with fairly old versions of Emacs by default. If you are using such a system, it is advisable to build Emacs from the released tarball or the git repository. You can prepare for this by first installing the package build-essential and then issuing the command "sudo apt-get build-dep emacs24". The whole process is described, for example, at emacs24ub. This method is also valid for later versions of Ubuntu.
8
For an explanation of how this is done, please see the section "A very short crash course on git: commit, push and pull" below.
9
If the catalog is not visible, there are several ways to display it. First look in the "Buffers" menu for an entry called "mandoku-catalog.txt". Or, if the "Mandoku" item is visible in the menu, use "Browse->Show catalog". Otherwise, the easiest way is as described above under "Starting Mandoku for the first time", i.e., using M-x mandoku-show-catalog.
10
A help screen for Magit with the most important commands can be called by pressing the key "?". Control-h i displays the Emacs manuals, which include one for Magit.
11
The mailing list is open to all users and all kinds of questions related to the use of Mandoku or Kanripo can be asked. The group can be found here: krgroup.

Bibliography