This has now been done. All logbooks use the same format, and we have only one parser.
There is an advantage in using a "separator format" rather than an "encapsulated entry format". When parsing the logbook.html file, everything will be in one of the entries if we use a separator (e.g. <hr>, as opposed to an <article> ... </article> encapsulation). With encapsulation, anything that falls between two encapsulated entries belongs to neither, although it was probably meant to be in an adjacent entry. So we are continuing to use the <hr> separator format.
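As a minimal sketch of why the separator style loses nothing, the following Python splits a logbook file on <hr> tags. This is not the actual parser: the function name `split_entries`, the regular expression, and the sample markup (including the `tripdate` class) are assumptions made for illustration only.

```python
import re

# Hypothetical helper for illustration; not the real logbook parser.
# Matches <hr>, <HR>, <hr/> and <hr /> (with or without attributes).
HR_SEPARATOR = re.compile(r"<hr\b[^>]*>", re.IGNORECASE)


def split_entries(html_text):
    """Split logbook HTML into entries on <hr> separators, dropping empty chunks."""
    chunks = HR_SEPARATOR.split(html_text)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


# Made-up sample data: two entries, the first with a stray untagged line.
sample = """
<hr />
<div class="tripdate">2010-07-01</div>
<p>Rigged the entrance pitch.</p>
A stray note typed outside any tag.
<hr>
<div class="tripdate">2010-07-02</div>
<p>Surveyed the new passage.</p>
"""

entries = split_entries(sample)
for number, entry in enumerate(entries, start=1):
    print(f"--- entry {number} ---")
    print(entry)
```

Because every character of the file lies on one side or the other of a separator, the stray note still ends up inside the first entry. With <article> ... </article> encapsulation, a parser that only collects the wrapped blocks would drop any text that sits between them.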
We have had 4 different markdown and HTML formats for logbooks of different vintages. This means 4x as much maintenance as we need.
LOGBOOK_PARSER_SETTINGS = {
"2010": ("logbook.html", "Parseloghtmltxt"),
"2009": ("2009logbook.txt", "Parselogwikitxt"),
"2008": ("2008logbook.txt", "Parselogwikitxt"),
"2007": ("logbook.html", "Parseloghtmltxt"),
"2006": ("logbook.html", "Parseloghtmltxt"),
# "2006": ("logbook/logbook_06.txt", "Parselogwikitxt"),
"2005": ("logbook.html", "Parseloghtmltxt"),
"2004": ("logbook.html", "Parseloghtmltxt"),
"2003": ("logbook.html", "Parseloghtml03"),
"2002": ("logbook.html", "Parseloghtmltxt"),
"2001": ("log.htm", "Parseloghtml01"),
"2000": ("log.htm", "Parseloghtml01"),
"1999": ("log.htm", "Parseloghtml01"),
"1998": ("log.htm", "Parseloghtml01"),
"1997": ("log.htm", "Parseloghtml01"),
"1996": ("log.htm", "Parseloghtml01"),
"1995": ("log.htm", "Parseloghtml01"),
"1994": ("log.htm", "Parseloghtml01"),
"1993": ("log.htm", "Parseloghtml01"),
"1992": ("log.htm", "Parseloghtml01"),
"1991": ("log.htm", "Parseloghtml01"),
"1990": ("log.htm", "Parseloghtml01"),
"1989": ("log.htm", "Parseloghtml01"), #crashes MySQL
"1988": ("log.htm", "Parseloghtml01"), #crashes MySQL
"1987": ("log.htm", "Parseloghtml01"), #crashes MySQL
"1985": ("log.htm", "Parseloghtml01"),
"1984": ("log.htm", "Parseloghtml01"),
"1983": ("log.htm", "Parseloghtml01"),
"1982": ("log.htm", "Parseloghtml01"),
}
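The maintenance burden can be read straight off a table like this one by counting the distinct parser names. The sketch below uses a four-entry excerpt rather than the full table; the names `LOGBOOK_PARSER_SETTINGS_EXCERPT` and `parsers_in_use` are assumptions introduced for illustration only.

```python
from collections import Counter

# Excerpt of the settings table: one illustrative year per parser name.
LOGBOOK_PARSER_SETTINGS_EXCERPT = {
    "2010": ("logbook.html", "Parseloghtmltxt"),
    "2009": ("2009logbook.txt", "Parselogwikitxt"),
    "2003": ("logbook.html", "Parseloghtml03"),
    "2001": ("log.htm", "Parseloghtml01"),
}


def parsers_in_use(settings):
    """Count how many years are assigned to each parser name."""
    return Counter(parser for _filename, parser in settings.values())


counts = parsers_in_use(LOGBOOK_PARSER_SETTINGS_EXCERPT)
for parser_name in sorted(counts):
    print(parser_name, counts[parser_name])
```

Run against the full table, the same function would show how many years depend on each of the four parsers.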
Secondly, it is highly likely that most of the different parsers have errors and so some logbook entries do not get imported. One parser, which we could devote more effort to, would mean data does not get mislaid.
Thirdly, the current format is error-prone and nonsensical, so it is an unnecessary learning curve for all expoers.
There are several HTML structural tags we could choose from (see HTML5 structural elements): DIV, SECTION, ARTICLE, ASIDE.