Endings/staticSearch 2025-03-03 to 2025-03-07
to: Martin Holmes
Minutes: 95
On Thursday, met with JT and worked through the two pull requests, both of which were merged after some changes and rewrites. The main problem now is the OOM (out-of-memory) error. After comparing the JSON files from 1.4 and 2.0, we found few differences, and only one looked like a possible trigger: we’re no longer indenting the JSON by default, which I thought might make serialization inefficient because of the lack of linebreaks. After the meeting, I tested that theory by turning indentation back on, but it didn’t solve the problem (although it did cause the error to trigger in a different area of the code). The next stage is to look at the tokenized files from the two versions, to see whether something is expanding their size to the point where it causes an issue.
On Friday, determined that the stem files are actually smaller in v2, due to the removal of the docId identifier. So that’s not it. Weirder and weirder.