HCMC Journal

Monument 2026-01-26 to 2026-01-30

to : Martin Holmes
Minutes: 865

On Monday, reworked the processing that generates the individual place pages so that we now have landing pages for the locations whose people are split into subgroups by surname; this makes linking around the site much easier.

On Tuesday, I created the community-listing page and code to generate it, and added a few more translations to help get a sense of how long the various captions are likely to be. I then started thinking about the search pages and functionality, and of course this brings up the question of tokenizing and possibly stemming Japanese. This is a first shot at my left-field idea of tokenizing on writing-system boundaries (in XPath):

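(: Mark every writing-system boundary with '|', then tokenize on the marks. :)
tokenize(
  replace(
    replace(
      replace(
        (: 1. Turn every punctuation character into a separator. :)
        replace(//h1/span[@lang='ja']/text()[1], '\p{P}', '|'),
        (: 2. Separate kanji from kana, in either order. :)
        '([\p{IsCJKUnifiedIdeographs}])([\p{IsHiragana}\p{IsKatakana}])|([\p{IsHiragana}\p{IsKatakana}])([\p{IsCJKUnifiedIdeographs}])',
        '$1$3|$2$4'
      ),
      (: 3. Separate hiragana followed by katakana. :)
      '([\p{IsHiragana}])([\p{IsKatakana}])',
      '$1|$2'
    ),
    (: 4. Separate katakana followed by hiragana. :)
    '([\p{IsKatakana}])([\p{IsHiragana}])',
    '$1|$2'
  ),
  (: 5. Split on runs of one or more separators. :)
  '\|+'
)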
    

This will split e.g. ここにホームページのバナーが表示されます into: ここに, ホームページ, の, バナー, が, 表示, されます.
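For comparison, here is a minimal Python sketch of the same writing-system-boundary idea. It is not the project's code: the function names are hypothetical, and the code-point ranges are assumed to stand in for the `IsHiragana`, `IsKatakana`, and `IsCJKUnifiedIdeographs` blocks used in the XPath. Unlike `tokenize()`, it drops empty tokens.

```python
import unicodedata


def script_of(ch):
    """Return the writing system of one character, or None if it is not Japanese script."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:  # Hiragana block
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:  # Katakana block (includes the long-vowel mark)
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:  # CJK Unified Ideographs block
        return "cjk"
    return None


def tokenize_on_script_boundaries(text):
    """Split text at punctuation and wherever the writing system changes."""
    tokens = []
    current = ""
    previous = None
    for ch in text:
        # Punctuation ends the current token and is itself discarded.
        if unicodedata.category(ch).startswith("P"):
            if current:
                tokens.append(current)
            current = ""
            previous = None
            continue
        script = script_of(ch)
        # Split only between two different Japanese scripts, as the XPath does.
        if current and previous is not None and script is not None and script != previous:
            tokens.append(current)
            current = ""
        current += ch
        previous = script
    if current:
        tokens.append(current)
    return tokens


print(tokenize_on_script_boundaries("ここにホームページのバナーが表示されます"))
```

Because this walks the string one character at a time, it also splits on both sides of a single-character run such as the 木 in の木の. As far as I can tell, the paired-character pattern in the kanji/kana step would consume の木 as one match and leave 木 joined to the following の.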

I set up the initial English search with the new release of staticSearch, but there is a lot more configuration to do in the headers of the person pages, along with the page titles, to get things working properly.

On Wednesday, I wrote code to extract the coordinates from GN’s GeoJSON and add them into our place dataset, and then wrote the code to generate the required JSON and GeoJSON for the map. This involved small changes to nomenclature in both my code and his, to standardize on these levels (bottom up): Communities, Areas, Subregions, Regions. The GeoJSON is valid, but there are probably going to be some things to iron out to get the actual map working.
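The coordinate merge described above could look roughly like this in Python. This is only a sketch under assumptions: the `id` property, the place-record keys, and all function names are hypothetical, since neither GN's GeoJSON nor the place dataset schema is shown here, and only Point geometries are handled.

```python
# Levels named in the entry, bottom up.
LEVELS = ["Communities", "Areas", "Subregions", "Regions"]


def coordinates_by_id(geojson, id_property="id"):
    """Map each Point feature's identifier to its coordinate pair."""
    coords = {}
    for feature in geojson.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # other geometry types are ignored in this sketch
        key = (feature.get("properties") or {}).get(id_property)
        if key is not None:
            coords[key] = geometry["coordinates"]
    return coords


def add_coordinates(places, coords):
    """Return copies of the place records, with coordinates attached where known."""
    merged = []
    for place in places:
        record = dict(place)
        if place["id"] in coords:
            record["coordinates"] = coords[place["id"]]
        merged.append(record)
    return merged


def to_feature_collection(places):
    """Build a GeoJSON FeatureCollection from the places that have coordinates."""
    features = []
    for place in places:
        if "coordinates" not in place:
            continue
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": place["coordinates"]},
            "properties": {
                "id": place["id"],
                "name": place["name"],
                "level": place["level"],  # expected to be one of LEVELS
            },
        })
    return {"type": "FeatureCollection", "features": features}
```

Places without a matching feature are kept in the dataset but left out of the generated FeatureCollection, so a missing coordinate does not produce an invalid feature.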

On Thursday, turned the map tweaking back over to GN and focused on other tweaks such as getting the relative links right throughout the site. Also wrote generation code for a JavaScript caption object, to be used by JS running the map.
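As a rough illustration of generating a JavaScript caption object, the sketch below serializes a nested dictionary of captions into a `const` declaration. The variable name `mapCaptions`, the caption keys, and the sample text are all hypothetical; the actual generation code and caption structure are not shown in this entry.

```python
import json


def captions_to_js(captions, variable_name="mapCaptions"):
    """Render a nested dict of captions as a JavaScript const declaration."""
    # JSON object syntax is also valid JavaScript object-literal syntax.
    # ensure_ascii=False keeps Japanese text readable in the generated file.
    body = json.dumps(captions, ensure_ascii=False, indent=2, sort_keys=True)
    return f"const {variable_name} = {body};\n"


# Hypothetical sample data, keyed by caption id and then by language code.
sample = {"title": {"en": "Map", "ja": "地図"}}
print(captions_to_js(sample))
```

Sorting the keys keeps the generated file stable between builds, which makes diffs easier to read when captions change.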

On Friday, after getting a new local server up and running, I worked on adding a few more of the base pages and some suggested translations. I also added code to strip the JS and CSS that only the map needs from all the other pages, where it is unnecessary, so that it cannot interfere with them.