Callisto is an annotation tool developed to support linguistic annotation of textual sources for any Unicode-supported language. It is written in Java, and its initial development by the MITRE Corporation was funded by the U.S. government.
This FAQ answers some frequently asked questions. If
you do not find the answer to your question here,
consider joining the callisto-users-list
and posting your question there.
Callisto requires Java version 1.5 or later to run.
If you're developing programs, you want the SDK. To
just use Java programs you want the JRE.
Yes, you can specify the character encoding of the signal file (the default is UTF-8) when opening or importing. If you choose the wrong encoding, you may see your text in the wrong font, or some characters will look meaningless (though this can also be caused by using a font that does not have all the characters in the text). You can re-read the file in a different encoding by selecting a different "Character Encoding" from the "Format" menu.
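To illustrate why the wrong encoding produces meaningless characters, here is a minimal Python sketch. It is independent of Callisto, and the sample string is only an assumption chosen for illustration. The same bytes are decoded twice, once with the encoding they were written in and once with a different one.

```python
# The bytes of a UTF-8 encoded file containing one accented character.
raw = "café".encode("utf-8")        # 5 bytes: the "é" takes two bytes in UTF-8

# Decoding with the encoding the file was written in recovers the text.
as_utf8 = raw.decode("utf-8")

# Decoding with a single-byte encoding turns each byte of "é" into its own
# character, so the text looks garbled even though no data was lost.
as_latin1 = raw.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

Because the bytes on disk are unchanged, re-reading the file with the correct encoding restores the text.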
This is almost always caused by some program changing
the new-line characters automatically while the files
are being exchanged. This should not occur in the
latest versions of Callisto because the original signal
is now encoded in the annotation file.
Different operating systems use different characters to represent "new-line": some use two characters, while others use only one. With stand-off annotation, if the new-lines in the data files are changed, the annotation file must have all of its offsets updated, or each annotation will be "off by one" for each preceding new-line.
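The drift can be reproduced in a few lines of Python. This is a sketch only: it assumes offsets are character indices into the signal, and `span_text` and `shift_offset` are hypothetical helpers, not part of Callisto.

```python
def span_text(signal, start, end):
    """Return the text an annotation with these offsets points at."""
    return signal[start:end]


def shift_offset(dos_signal, offset):
    """Map an offset in a DOS (CR LF) signal to the UNIX (LF) version.

    Each CR LF pair that ends before the offset loses one character,
    so the offset moves left by one per preceding new-line.
    """
    return offset - dos_signal.count("\r\n", 0, offset)


# A signal with DOS new-lines and an annotation on the word "three".
dos_signal = "one\r\ntwo\r\nthree"
start = dos_signal.index("three")
end = start + len("three")

# The same signal after some program silently converted the new-lines.
unix_signal = dos_signal.replace("\r\n", "\n")

print(repr(span_text(dos_signal, start, end)))    # annotation as created
print(repr(span_text(unix_signal, start, end)))   # stale offsets, wrong text
print(repr(span_text(unix_signal,
                     shift_offset(dos_signal, start),
                     shift_offset(dos_signal, end))))  # corrected offsets
```

Note that the correction needs the original signal to count the new-lines, which is why the problem cannot be repaired from the annotation file alone.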
The following means are known to "auto-convert" files:
We've considered several means of automatically correcting the problem in Callisto. Unfortunately, without embedding the original data file in the standoff annotation file, it's impossible to automatically correct all problems.
That said, correction could be as easy as changing all new-lines to DOS or UNIX style. This can be done in several ways.
Conversion | Perl script |
---|---|
DOS to UNIX | perl -i -pe 's/\x0d\x0a/\x0a/g' <filename> |
UNIX to DOS | perl -i -pe 's/\x0a/\x0d\x0a/g' <filename> |
MAC to DOS | perl -i -pe 's/\x0d/\x0d\x0a/g' <filename> |
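Before converting, it can help to check which new-line style a file actually uses, since running a conversion on a file that is already in the target style can corrupt it (for example, UNIX to DOS applied to a DOS file doubles the carriage returns). The following Python sketch shows one way to do that. `detect_newlines` and `normalize_newlines` are hypothetical helpers, not part of Callisto, and they work on the raw bytes of the file.

```python
import re


def detect_newlines(data):
    """Count each new-line style found in the raw bytes of a file."""
    crlf = data.count(b"\r\n")
    return {
        "dos": crlf,                        # CR LF pairs
        "unix": data.count(b"\n") - crlf,   # LF not preceded by CR
        "mac": data.count(b"\r") - crlf,    # CR not followed by LF
    }


def normalize_newlines(data, target=b"\n"):
    """Rewrite every new-line, whatever its style, as `target`.

    CR LF is listed first in the pattern so a DOS new-line is replaced
    as one unit instead of being treated as a CR followed by an LF.
    """
    return re.sub(rb"\r\n|\r|\n", lambda _match: target, data)


sample = b"a\r\nb\nc\rd"
print(detect_newlines(sample))
print(normalize_newlines(sample))
print(normalize_newlines(sample, b"\r\n"))
```

To apply this to a file, read and write it in binary mode (`open(path, "rb")` and `open(path, "wb")`) so that Python does not translate the new-lines itself.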
The most reliable mechanism we have found is to use the "tar" (and optionally "gzip") utilities to archive files before transferring them, and to unpack them after the transfer. Windows users can get these command-line tools with Cygwin.
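For users without the command-line tools, the same idea can be sketched with Python's standard `tarfile` module. This is an illustration under assumptions, not a Callisto feature: `pack` and `unpack` are hypothetical helpers, the file names are made up, and the archive is kept in memory to keep the example short.

```python
import io
import tarfile


def pack(files):
    """Build a gzip-compressed tar archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack(archive):
    """Read every regular file back out of the archive as raw bytes."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                files[member.name] = tar.extractfile(member).read()
    return files


# Hypothetical signal and annotation files with different new-line styles.
originals = {
    "doc.txt": b"one\r\ntwo\r\nthree",
    "doc.ann": b"first\nsecond\n",
}

restored = unpack(pack(originals))
print(restored == originals)
```

The archive stores file contents as opaque bytes, so the new-lines survive the round trip as long as the unpacking tool does not convert them.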
Windows users can use WinZip if the preferences are corrected on the machine where the files are unpacked. Open WinZip, and open the "Options->Configuration" menu. Under the "Miscellaneous" tab, in the "Other" group, un-check the "TAR file smart CR/LF conversion" option.