The program for reconstructing a phylogeny for a set of taxa (phylogeny.lp) and the preprocessed datasets used in our experiments (Alcataenia, Chinese, Indo-European) are enclosed in cladistics-basic.tar.gz.
With these files, we can use cmodels (with zchaff) to generate phylogenies with at most n incompatible characters for Chinese dialects, Indo-European languages, and Alcataenia species. For instance, a phylogeny for Chinese dialects with at most 6 incompatible characters can be generated by the command
For more efficient computation, we modify the phylogeny program above according to the heuristics described in Section 4 of [Brooks et al., 2005]. The modified program is in the file phylogeny-improved.lp. For instance, a phylogeny for Indo-European language groups with at most 16 incompatible characters can be generated by the command
(We use n=9 instead of n=16 because of preprocessing, as explained in Section 7 of [Brooks et al., 2005].)
The datasets before preprocessing are in the files Chinese-unprocessed, Alcataenia-unprocessed, and Indo-European-unprocessed.
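To make the bound "at most n incompatible characters" more concrete, the following Python sketch counts the incompatible characters of a given tree. It is only an illustration and is not part of the enclosed files. It assumes the usual convexity reading of compatibility: a character is compatible with a tree if, for each state, the smallest subtrees connecting the leaves with that state do not share a vertex. The tree, the taxon names (A to D), and the characters c1 and c2 are hypothetical, and the sketch makes no attempt to reproduce the encoding in phylogeny.lp.

```python
from itertools import combinations


def spanning_subtree(adj, targets):
    """Return the vertices of the smallest subtree of `adj` containing `targets`.

    `adj` maps each vertex of an undirected tree to the set of its neighbours.
    Leaves that are not targets are pruned repeatedly until none remain.
    """
    keep = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if degree[v] <= 1 and v not in targets]
    while stack:
        v = stack.pop()
        if v not in keep:
            continue  # already pruned
        keep.discard(v)
        for u in adj[v]:
            if u in keep:
                degree[u] -= 1
                if degree[u] <= 1 and u not in targets:
                    stack.append(u)
    return keep


def is_compatible(adj, leaf_state):
    """True if the per-state spanning subtrees are pairwise vertex-disjoint."""
    leaves_by_state = {}
    for leaf, state in leaf_state.items():
        leaves_by_state.setdefault(state, set()).add(leaf)
    subtrees = [spanning_subtree(adj, ls) for ls in leaves_by_state.values()]
    return all(a.isdisjoint(b) for a, b in combinations(subtrees, 2))


def count_incompatible(adj, characters):
    """Number of characters that are not compatible with the tree."""
    return sum(not is_compatible(adj, states) for states in characters.values())


# Hypothetical four-taxon tree ((A,B),(C,D)) with internal vertices x and y.
tree = {
    "A": {"x"}, "B": {"x"}, "C": {"y"}, "D": {"y"},
    "x": {"A", "B", "y"}, "y": {"C", "D", "x"},
}
# Hypothetical characters: c1 groups A with B, c2 groups A with C.
characters = {
    "c1": {"A": 0, "B": 0, "C": 1, "D": 1},
    "c2": {"A": 0, "B": 1, "C": 0, "D": 1},
}
print(count_incompatible(tree, characters))  # 1 (only c2 is incompatible)
```

In this example the tree would be accepted for any bound n of at least 1, since exactly one character (c2) is incompatible with it.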