Questions re sorting and orthography
Posted by mholmes on 15 Mar 2013 in Activity log
We currently sort our Moses entries based on the phonemic representation, using a Java comparator I wrote specifically for the project. Now that we're going to have orthographical representations, the sort order will have to be amended to take account of that. I'm therefore reviving the NetBeans project for the MosesCollation, and beginning to update it.
This is the current sort order.
We'll need to add the following characters to the list:
- č (u010d) (does it sort before or after c?)
- š (u0161) (does it sort before or after s?)
- x̌ (x + u030c, the combining caron; u0323 would be the combining dot below) (does it sort before or after x?)
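
To make the question concrete, here is a minimal sketch of how such letters could be slotted into an alphabet-driven comparator. It is not the existing MosesCollation code: the class name, the abbreviated alphabet, and above all the placement of each new letter directly after its base letter are assumptions made purely for illustration, since the before/after question is still open. Input is normalized to NFC so that c + u030c and the precomposed u010d are treated as the same letter, while x + u030c (which has no precomposed form) is matched as a two-character unit.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: a hypothetical alphabet-order comparator, not MosesCollation. */
class OrthographySortSketch implements Comparator<String> {

    // Hypothetical, heavily abbreviated alphabet in NFC form.
    // ASSUMPTION: each new letter sorts immediately after its base letter.
    private static final List<String> ALPHABET = Arrays.asList(
        "a",
        "c", "\u010d",      // c, then c with caron
        "s", "\u0161",      // s, then s with caron
        "x", "x\u030c");    // x, then x + combining caron

    /** Converts a word into a list of alphabet ranks, one per letter. */
    private static List<Integer> sortKey(String word) {
        String s = Normalizer.normalize(word, Normalizer.Form.NFC);
        List<Integer> key = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int bestRank = -1;
            int bestLen = 0;
            // Prefer the longest alphabet entry that matches at position i,
            // so "x" + caron is read as one letter rather than as plain "x".
            for (int r = 0; r < ALPHABET.size(); r++) {
                String letter = ALPHABET.get(r);
                if (letter.length() > bestLen && s.startsWith(letter, i)) {
                    bestRank = r;
                    bestLen = letter.length();
                }
            }
            if (bestRank < 0) {
                // Unlisted character: rank it after every listed letter.
                bestRank = ALPHABET.size() + s.charAt(i);
                bestLen = 1;
            }
            key.add(bestRank);
            i += bestLen;
        }
        return key;
    }

    @Override
    public int compare(String a, String b) {
        List<Integer> ka = sortKey(a);
        List<Integer> kb = sortKey(b);
        int n = Math.min(ka.size(), kb.size());
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(ka.get(i), kb.get(i));
            if (c != 0) {
                return c;
            }
        }
        // When one key is a prefix of the other, the shorter word sorts first.
        return Integer.compare(ka.size(), kb.size());
    }
}
```

With this arrangement, answering "before or after?" for a given letter only means moving its entry within the alphabet list; a list of entries can then be ordered with `words.sort(new OrthographySortSketch())`.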