Permalink 04:09:09 pm, by mholmes, 232 words, 6 views   English (CA)
Categories: Activity log; Mins. worked: 240

Developing local search engine

One of our project goals is to investigate the practicality of developing a local search "engine" which does not require any server-side support, to find out whether it is possible to do this, and if so, how large a site can be before it's impractical. Today I did the first half of this work, with the Keats site as a pilot (because it's modern(ish) English, it's of a size which is not huge but not trivial, and it doesn't have any back-end and probably shouldn't). This is what I've got so far:

  • XSLT tokenizes all the content files, duplicating some bits to create simplistic weighting. I attempt to preserve proper names by retaining capitalization for all words which don't appear in a small English word-list (40,000 words).
  • A python Porter stemmer stems all the non-proper-name tokens.
  • XSLT amalgamates all the token-counts and their source documents.
  • XSLT generates a separate JSON file for each token, containing a list of all documents containing it, and how many hits there are in that document.
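As a rough illustration of the last two steps above (the real pipeline does this in XSLT), here is a minimal Python sketch of amalgamating token counts and writing one JSON file per token. The function names, the `{"token": ..., "docs": {...}}` layout, and the example file names are all assumptions made for illustration; they are not the structure the project actually uses.

```python
import json
from collections import Counter, defaultdict
from pathlib import Path


def build_token_index(docs):
    """Map each token to {doc_id: hit_count}.

    `docs` maps a document id to its list of tokens, assumed to be
    already stemmed (with proper names left capitalized).
    """
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for token, count in Counter(tokens).items():
            index[token][doc_id] = count
    return dict(index)


def write_token_files(index, out_dir):
    """Write one JSON file per token, e.g. out_dir/letter.json.

    Hypothetical layout. A real build would also need to handle tokens
    that are unsafe as file names, and case-insensitive file systems.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for token, hits in index.items():
        payload = {"token": token, "docs": hits}
        (out / f"{token}.json").write_text(json.dumps(payload), encoding="utf-8")
```

With something like this, `write_token_files(build_token_index(docs), "index")` would leave a folder of small static files that a browser can fetch one at a time, which is what removes the need for server-side support.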

Next, we write a search engine interface in which we use JavaScript to:

  • stem each search term (unless it's a proper name). I've found a JS implementation of the Porter stemmer.
  • retrieve the JSON files for each of the search terms
  • unify them to get hit counts for each individual document
  • display (paged?) results
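The "unify" and paging steps could look roughly like the following. This is sketched in Python only for illustration, since the actual interface is planned in JavaScript. It assumes each retrieved JSON file has already been parsed into a record of the hypothetical form `{"token": ..., "docs": {doc_id: hits}}`; ranking by summed hit count is also an assumption, not a decided scoring method.

```python
from collections import Counter


def merge_hits(token_records):
    """Sum per-document hit counts across the records for all search terms."""
    totals = Counter()
    for record in token_records:
        # Counter.update with a mapping adds the counts to existing totals.
        totals.update(record.get("docs", {}))
    # Highest total first; ties broken by document id for stable output.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


def page(results, page_number, page_size=10):
    """Return one page of results; page_number starts at 1."""
    start = (page_number - 1) * page_size
    return results[start:start + page_size]
```

For example, merging a record for "letter" with one for "Keats" gives a single ranked list of documents, and `page(results, 1)` returns the first ten entries for display.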

Should be doable in a few hours.


Permalink 04:57:12 pm, by mholmes, 21 words, 19 views   English (CA)
Categories: Activity log; Mins. worked: 90

Endings: first interview

With LG, EC, and MH, did the first Endings interview, and got a good hour's rich material. Made two backup recordings.


Permalink 02:59:38 pm, by mholmes, 13 words, 16 views   English (CA)
Categories: Activity log; Mins. worked: 60

Endings meeting

Project meeting; wrote up the notes afterwards and added them to the repo.


Permalink 04:27:24 pm, by mholmes, 13 words, 34 views   English (CA)
Categories: Activity log; Mins. worked: 60

Finishing up article

With JT, working on the article for submission to DSH. I'll submit tomorrow.


Permalink 05:01:49 pm, by mholmes, 38 words, 35 views   English (CA)
Categories: Activity log; Mins. worked: 180

Meeting, discussion, workshop outline

During the meeting we brainstormed around the plan for the DHSI course, and then JJ and I drafted it, including a day-by-day plan. It's now with the group for feedback; will submit at the end of the week.


Permalink 11:41:14 am, by sarneil, 135 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 30

LimeSurvey as plan B for survey

If FluidSurvey isn't viable, we're looking at LimeSurvey.org. I've emailed to see if their hosting is on non-American servers. If we decide to host ourselves, their specifications are:

  • MySQL 5.5.3 or later OR Microsoft SQL Server 2005 or later OR Postgres 9 or later.
  • Minimum PHP 5.5.9 or later; however, we recommend PHP 7.0.0+ with the following modules/libraries enabled:
      - mbstring (Multibyte String Functions) extension library.
      - PDO database driver for MySQL (pdo_mysql or pdo_mysqli) or Postgres (pdo_pgsql) or MSSQL (pdo_sqlsrv for Windows and pdo_dblib for Linux).
      - Also, we assume in general that all PHP default libraries are enabled (like hash, session etc.).
  • Minimum 180 MB disk space.

Full details at: https://manual.limesurvey.org/Installation_-_LimeSurvey_CE

Our production environment does not meet those requirements, but the webserver2 environment does.


Permalink 04:43:38 pm, by mholmes, 53 words, 35 views   English (CA)
Categories: Activity log; Mins. worked: 75

Meeting and plans

Reviewed CC's Kula submission. At the meeting we agreed to hire EC to run the survey, and SA will work with her to get it set up. Emails will come from an address we set up for the project. Talked about the DHSI 2019 plan; we'll finalize it at our next meeting on Dec 18.


Permalink 04:43:09 pm, by mholmes, 34 words, 43 views   English (CA)
Categories: Activity log; Mins. worked: 60

Project meeting; submission of abstract

Monthly project meeting; submitted DPASSH abstract to SM at Kula, and they are interested, so SA and I will expand it into a full article after the TEI conference is out of the way.


Permalink 05:13:07 pm, by mholmes, 47 words, 45 views   English (CA)
Categories: Activity log; Mins. worked: 60

Project meeting

With key people absent we didn't have a lot to talk about, but top of the list was updating the OAC website, and revisiting our grant application to make sure we're going to be able to meet all our promised outputs in terms of presentations and publications.


Permalink 04:41:29 pm, by mholmes, 22 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 120

Expenses form

Finally got around to doing the expenses form from the conference; now it's ready to sign, but there's no-one to sign it.



Blog for the SSHRC-funded Endings project, 2016-2020

