This page is a checklist for people who are helping with the survey data reduction process during and after expo.
All of these things should be checked every week for a couple of months after expo, then every couple of months, and again when preparing for the new expo in May.
We do this regularly during software maintenance to check that everything is still working. See the database reset documentation. But it must be done:
This is much less of a problem now that we have nearly all the file uploading done by troggle forms.
The most common UTF8 problem is with uploaded files containing German umlaut characters which have been encoded using an extended-ASCII code such as ISO-8859-1 or Windows-1252. (All umlauts in webpages and logbook entries should be written as HTML entities such as &auml;, &ouml; and &uuml;. So this is an issue mostly with survex files and survey files such as topo or tunnel.)
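One way to find files that are not valid UTF-8 is to ask iconv to read each file as UTF-8 and see whether it complains. This is only a minimal sketch and not part of the expo toolset: the function names is_utf8, list_non_utf8 and to_utf8 are made up for illustration, and to_utf8 assumes the offending file really is ISO-8859-1, which should be checked by eye before converting.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not existing expo scripts.

# is_utf8 FILE: succeeds if iconv can read FILE as UTF-8 without error.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# list_non_utf8 DIR [PATTERN]: print files under DIR matching PATTERN
# (default *.svx) that fail the UTF-8 check.
list_non_utf8() {
    find "$1" -type f -name "${2:-*.svx}" | while IFS= read -r _f; do
        if ! is_utf8 "$_f"; then
            printf '%s\n' "$_f"
        fi
    done
}

# to_utf8 FILE: rewrite FILE as UTF-8, ASSUMING it is currently ISO-8859-1.
# Only run this on files flagged by list_non_utf8: running it on a file
# that is already UTF-8 would double-encode the umlauts.
to_utf8() {
    _out=$(mktemp) || return 1
    iconv -f ISO-8859-1 -t UTF-8 "$1" >"$_out" && cat "$_out" >"$1"
    _status=$?
    rm -f "$_out"
    return "$_status"
}
```

For example, `list_non_utf8 . '*.svx'` run from the top of a checkout prints the suspect survex files, and each can then be inspected before `to_utf8` is applied to it.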
To fix EOL problems, use dos2unix to convert any uploaded Windows text files to the format expected by our software. e.g.
cd expofiles/surveyscans
find . -not -type d -exec file "{}" ";" | grep CRLF >crlf.txt
awk -F: '// {print "dos2unix \"" $1 "\""}' crlf.txt
It is also a good idea to run this on all of expofiles once every few years, as many GPX exports from phones are a bit variable in how they do EOL characters. The dataset is kept with unix linefeed style. DOS (and mac) files get checked in regularly, and from time to time someone uses an editor so dim that it makes files with mixed line endings.
This handy command will unixfy all the DOS-style .svx files in the :loser: repository:
find . -not -type d -name "*.svx" -exec file "{}" ";" | grep CRLF |
awk '{print $1}' | sed -e 's/:$//' | xargs fromdos -v
It needs the 'tofrodos' package installed. 'dos2unix' can be used instead.
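The `file | grep CRLF` pipelines above do not separate files that are DOS-style throughout from files with mixed line endings. A rough sketch of one way to tell them apart is to count the lines ending in a carriage return and compare with the total line count. The function names eol_style and list_by_eol are hypothetical, and old Mac files that use a bare CR as the line terminator are not handled.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not existing expo scripts.

# eol_style FILE: print "lf", "crlf", "mixed" or "empty" for FILE.
eol_style() {
    _cr=$(printf '\r')
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    _total=$(grep -c '' "$1" || true)
    _crlf=$(grep -c "${_cr}\$" "$1" || true)
    if [ "$_total" -eq 0 ]; then
        echo empty
    elif [ "$_crlf" -eq 0 ]; then
        echo lf
    elif [ "$_crlf" -eq "$_total" ]; then
        echo crlf
    else
        echo mixed
    fi
}

# list_by_eol DIR STYLE [PATTERN]: print files under DIR matching PATTERN
# (default *.svx) whose line-ending style is STYLE.
list_by_eol() {
    find "$1" -type f -name "${3:-*.svx}" | while IFS= read -r _f; do
        if [ "$(eol_style "$_f")" = "$2" ]; then
            printf '%s\n' "$_f"
        fi
    done
}
```

For example, `list_by_eol . mixed` lists the mixed-lineend .svx files, which are worth looking at by hand before converting.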
See manual for more on encoding conventions for cave names, filenames and HTML formatting.
A user-editable online to-do list for data management is now part of the expo online systems. Review this regularly to see what needs doing, and please *delete* jobs that have been done.
The #00 wallet directory (e.g. /2020/2020#00/) contains orphan files that have been found on the expo laptop in odd places, or have been scanned from bits of notebook found inside other documents. Keep an eye on it and re-file the contents as you discover what they are.
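A quick way to keep an eye on the #00 wallets is to count the files left in each one. This is a small sketch which assumes the `<year>/<year>#00/` layout shown in the example path above; the function name orphan_counts is made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes wallet directories named like 2020/2020#00 under ROOT.

# orphan_counts ROOT: for each directory under ROOT whose name ends in "#00",
# print "<number of files> <path>", sorted by path.
orphan_counts() {
    find "$1" -type d -name '*#00' | sort | while IFS= read -r _d; do
        # Arithmetic expansion strips any padding that wc adds.
        _n=$(( $(find "$_d" -type f | wc -l) ))
        printf '%s %s\n' "$_n" "$_d"
    done
}
```

Run as `orphan_counts .` from the directory holding the year folders; any wallet with a non-zero count still has files waiting to be re-filed.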
As the caves get written up (i.e. as survex files are written), run the QM reports on the updated caves to check that the QM data appears correctly. Check the DataIssues page for import error messages. See the detailed QM instructions.
This is now obsolete:
Run svx2qm.py and find-dead-qms.py to check that the QMs have all been entered correctly into the svx files and that the cave descriptions have been updated with (a) the new open QMs and (b) the old closed QMs.
You can now upload a new survex file for an existing cave without doing any other preparations.
This is now obsolete:
Look at the valid SVX refs page to check that new svx files properly reference the wallet folders, and create the wallet folder link back to the svx if the contents.json file in the wallet folder needs updating.
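As a rough offline cross-check, the survex files that carry no wallet reference at all can be listed from the command line. This sketch assumes that the wallet reference appears in the survex file as a line beginning with the survex `*ref` command (for example `*ref 2020#01`); the function name svx_without_ref is hypothetical, and the valid SVX refs page remains the authoritative check.

```shell
#!/bin/sh
# Sketch only: ASSUMES wallet references are written as "*ref ..." lines.

# svx_without_ref DIR: print .svx files under DIR that contain no line
# starting (after optional whitespace) with *ref. Lines commented out
# with ";" do not count as references.
svx_without_ref() {
    find "$1" -type f -name '*.svx' | sort | while IFS= read -r _f; do
        if ! grep -q -i '^[[:space:]]*\*ref' "$_f"; then
            printf '%s\n' "$_f"
        fi
    done
}
```

Older survex files may legitimately have no reference, so the output is a list of candidates to look at rather than a list of errors.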
During preparation for the new expo the folklist will be updated with all the new people expected, but after expo the mugshots and blurb text for the new people will need to be added. See folkupdate for the procedure.