Juxta-cl
Forked version of the Juxta Command Line tool created for eMOP by Performant Software Solutions. Will be official after eMOP is complete (10/1/14).
Juxta-CL is available open-source for use under the Apache Software License, v2.0.
+
+
Juxta Command Line (Juxta CL)
Synopsis
JuxtaCL is a specialized form of Juxta. It is a command-line tool that accepts the path to two files a parameters. It will collate them and return their change index (degree of difference between the two files).
Requirements
Java 1.6+ Maven 3.x
Build
JuxtaCL is built by maven. Execute: mvn package
Usage
Once built, the binary distribution can be found in the target directory. It will be named: juxta-{version}-bin.tar.gz. Expand this archive and move into the top level directory. It contains a script for launching JuxtaCL named juxta.sh. It accepts the following command line arguments:
-help - displays usage information
-version - prints the JuxtaCL version
-strip
Valid options include:
[+|-]case - toggles case sensitivity. Default: insensitive
[+|-]punct - toggles punctuation sensitivity Default: insensitive
-hyphen [all|linebreak|none] - sets hyphenation handling Default: all
-algorithm - set the algorithm used [ juxta | to determine percent levenshtein | differnece betweenn the jaro_winkler | files. Defaults to juxta. dice_sorensen ]
Also included in the final package are helper scrips; strip.sh. diff.sh and all.sh.
strip.sh takes one XML file as a parameter. Flat text is dumped to std:out
diff.sh takes 2 files. The change index (as calculated with Juxta algorigthm) is returned. The -algorithm paramter is also accepted.
all.sh takes 2 file parameters. It will return a table of change indexes, one row for each algorithm.
Installing
Juxta-CL is a java based tool and so can be run on any platform that is loaded with the Java SE Developer's Kit. To install Juxta-CL without building it yourself:
- Download and unzip the Juxta-CL.zip file.
- For Windows users, download/copy the juxta.bat file, and put it in your new Juxta-CL folder.
- type sh juxta.sh or juxta.bat for Juxta-CL help information