MEDITE is text alignment software designed by Jean-Gabriel Ganascia. It automatically identifies four types of transformation between two linear states of a text: deletions, insertions, replacements and displacements. Its distinctive feature is the reliable identification of displacements (blocks of text moved from one position to another), a capability not found in other alignment software.
To display the alignment results, a graphical interface presents the two texts side by side and highlights inserted, deleted, replaced and moved blocks in different colors. Aligned blocks are linked to each other, and a single mouse click follows the link, which makes the results easier to view and analyze.
The MEDITE algorithm relies in particular on suffix trees to detect homologous sequences between the two text states, and on hidden Markov models (HMMs) to optimize the order of these sequences.
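The idea of aligning blocks while allowing for moves can be illustrated with a minimal Python sketch. It is not MEDITE's implementation: the suffix tree is replaced by a quadratic dynamic-programming search for the longest common word block, and the HMM-based ordering step is replaced by a weighted longest increasing subsequence. The function names and the `min_len` parameter are hypothetical, and replacements are not reported separately (a deleted run and an inserted run at the same place would together count as one replacement).

```python
def longest_common_block(a, b, used_a, used_b):
    """Return (i, j, n) with a[i:i+n] == b[j:j+n], the longest run of unused tokens."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(len(a)):
        cur = [0] * (len(b) + 1)
        if not used_a[i]:
            for j in range(len(b)):
                if not used_b[j] and a[i] == b[j]:
                    # Length of the common run ending at a[i] and b[j].
                    cur[j + 1] = prev[j] + 1
                    if cur[j + 1] > best[2]:
                        n = cur[j + 1]
                        best = (i - n + 1, j - n + 1, n)
        prev = cur
    return best


def unmatched_runs(tokens, used):
    """Group consecutive tokens that belong to no common block."""
    runs, run = [], []
    for tok, is_used in zip(tokens, used):
        if is_used:
            if run:
                runs.append(" ".join(run))
                run = []
        else:
            run.append(tok)
    if run:
        runs.append(" ".join(run))
    return runs


def diff_with_moves(old, new, min_len=2):
    """Classify word-level differences as moved, deleted or inserted."""
    a, b = old.split(), new.split()
    used_a, used_b = [False] * len(a), [False] * len(b)

    # Step 1: greedily extract common blocks, longest first, in any order.
    blocks = []
    while True:
        i, j, n = longest_common_block(a, b, used_a, used_b)
        if n < min_len:
            break
        blocks.append((i, j, n))
        for k in range(n):
            used_a[i + k] = used_b[j + k] = True

    # Step 2: keep "in place" the order-preserving set of blocks that covers
    # the most words (weighted longest increasing subsequence); the remaining
    # common blocks are reported as moved.
    blocks.sort()
    score = [n for _, _, n in blocks]
    back = [-1] * len(blocks)
    for x in range(len(blocks)):
        for y in range(x):
            if blocks[y][1] < blocks[x][1] and score[y] + blocks[x][2] > score[x]:
                score[x] = score[y] + blocks[x][2]
                back[x] = y
    in_place = set()
    if blocks:
        x = max(range(len(blocks)), key=lambda idx: score[idx])
        while x != -1:
            in_place.add(x)
            x = back[x]

    return {
        "moved": [" ".join(a[i:i + n])
                  for idx, (i, j, n) in enumerate(blocks) if idx not in in_place],
        "deleted": unmatched_runs(a, used_a),
        "inserted": unmatched_runs(b, used_b),
    }


if __name__ == "__main__":
    print(diff_with_moves(
        "Paris was quiet that morning nobody walked along the river",
        "nobody walked along the river Paris was calm that morning",
    ))
```

On this example the sketch keeps the five-word block "nobody walked along the river" in place, reports "Paris was" and "that morning" as moved, "quiet" as deleted and "calm" as inserted. A plain left-to-right diff would instead describe the moved blocks as unrelated deletions and insertions, which is the limitation that move detection addresses.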
MEDITE is particularly useful for textual genetics work and the preparation of critical editions involving the annotation of variants.
MEDITE is currently being updated. The new version of the software will have a new interface and will support the comparison of texts transcribed in XML-TEI. It will be delivered both as a freely accessible web version and as a batch version that can be run from the command line, to make it easier to integrate into digital environments.
Members
Jean-Gabriel Ganascia (Professor, Sorbonne Université, LIP6), OBVIL and eBalzac project
Yeugene Cavanov (Research engineer, ISCD)
Andrea Del Lungo (Professor, Université de Lille), eBalzac project
Franck Deroche-Gamonet (Engineer, ESPE d’Aquitaine/Université de Bordeaux), ECRICOL project
Alexandre Guilbaud (Associate professor, SU, IMJ-PRG), Œuvres complètes de D’Alembert
Maurice Niwese (Associate professor in language sciences, Université de Bordeaux), ECRICOL project
Irène Passeron (Research director, CNRS, ISCD), Œuvres complètes de D’Alembert
Côme Saignol (PhD student, OBVIL)
Karolina Suchecka (PhD student, Université de Lille), eBalzac project
Bibliography
Bourdaillet J., Ganascia J.-G., Lebrave J.-L., “Topologie et génétique textuelles : un dialogue médié par la machine”, Revue Lexicometrica, special issue “Topographie et topologie textuelles”, 2009, ISSN 1773-0570.
Fenoglio I., Ganascia J.-G., “MEDITE : un logiciel pour l’approche comparative de documents de genèse”, Revue Genesis, pp. 166-168, 2007.
Bourdaillet J., Ganascia J.-G., “Practical block sequence alignment with moves”, LATA 2007, International Conference on Language and Automata Theory and Applications, March-April 2007.
Bourdaillet J., Ganascia J.-G., “Alignment of Noisy Unstructured Text Data”, IJCAI-2007 Workshop on Analytics for Noisy Unstructured Text Data, Hyderabad, India, January 8, 2007.