# INEL Corpus Services

### How to run it through CLI

An example:

> java -Xmx3g -jar /path/to/corpus-services.jar -i /path/to/corpus -o path/to/corpus/curation/report-output.html -c INELChecks -f -p "fsm=/path/to/corpus/corpus-utilities/segmentation.fsm"

**Available options**

*-i*, *--input*

Required. The path to the source file(s) you want to perform an action on.

> -i /path/to/corpus

If it's a path to a directory, the action will be applied to all eligible files within the directory and all its subdirectories.

> -i /path/to/corpus/selkup.coma

If it's a path to a file, the action will be applied to that file only.

*-o*, *--output*

Optional. The path to a report file (HTML or JSON) containing warnings and errors found in the source data. If this option is not provided, no report will be made.

> -o path/to/corpus/curation/report-output.html

Produces an HTML version of the report that can be viewed in a browser.

> -o path/to/corpus/curation/report.json

Produces a JSON version of the report.

> -o path/to/corpus/curation/report-output.html -o path/to/corpus/curation/report.json

You may provide the option twice to produce both versions of the report.

*-c*, *--corpusfunction* / *-u*, *--utilityfunction*

Optional. The name of the function you want to run, that is, the case-sensitive name of the respective Java class from the .validation or .utilities package. If neither *-c* nor *-u* is provided, Corpus Services will do nothing. Currently *-c* and *-u* perform the same actions and are thus interchangeable, though that may be subject to change in the future.

> -i selkup.coma -c ComaApostropheChecker

Will call ComaApostropheChecker on the comafile.

> -i selkup.coma -u PrettyPrintData

Will call PrettyPrintData on the comafile.

> -i selkup.coma -c ComaApostropheChecker -c ComaAttachedFilepathsChecker -c ComaFileCoverageChecker

You may chain the option to run several checking classes during the same run.

> -c INELChecks

A useful shortcut to run all the functions from the .validation package.

*-f*, *--fix*

Optional, boolean. If selected, Corpus Services will automatically fix some errors where possible and rewrite the source files. If not, Corpus Services will only collect issues to be written to a report, and no changes to the source files will be made.

*-p*, *--property*

Optional. Some checks require properties to be provided by the user and will not run correctly without them. The general syntax is *-p "property_name=property_value"*.

> -c ExbSegmentationChecker -p "fsm=/path/to/corpus/corpus-utilities/segmentation.fsm"

ExbSegmentationChecker looks for an external FSM to perform segmentation. The *property_name* in this example is *fsm*, the *property_value* is */path/to/corpus/corpus-utilities/segmentation.fsm*.

> -u ComaMassLinkFiles -p "exb=true" -p "eaf=true" -f

You may chain several properties in one call, same as with *-c/-u*. In the example above, ComaMassLinkFiles is used to automatically link EXB and EAF files to their respective communications in the comafile.

*-x*, *--xquery*

Optional, specific to the class XQueryWrapper. Contains the query name.

> -u XQueryWrapper -x pos

Here Corpus Services will run the query named *pos*, which counts parts of speech across several corpora.
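
When runs are scripted, the options described above can be assembled programmatically. The following is a minimal Python sketch and not part of Corpus Services: `build_command` and its parameter names are hypothetical, and the script only builds and prints the argument list rather than launching Java. It assumes the flags behave as documented above (repeatable *-o*, *-c* and *-p*, boolean *-f*).

```python
import shlex


def build_command(jar, input_path, functions=(), outputs=(),
                  properties=None, fix=False, heap="3g"):
    """Assemble the argument list for one run (hypothetical helper)."""
    cmd = ["java", f"-Xmx{heap}", "-jar", jar, "-i", input_path]
    for out in outputs:
        # -o may be given twice to get both an HTML and a JSON report.
        cmd += ["-o", out]
    for name in functions:
        # -c may be chained; -u is currently interchangeable with -c.
        cmd += ["-c", name]
    if fix:
        # -f lets the tool rewrite source files, so it is opt-in here.
        cmd.append("-f")
    for key, value in (properties or {}).items():
        # Each property is passed as its own -p "name=value" pair.
        cmd += ["-p", f"{key}={value}"]
    return cmd


# Rebuilds the example command from the top of this section.
cmd = build_command(
    jar="/path/to/corpus-services.jar",
    input_path="/path/to/corpus",
    functions=["INELChecks"],
    outputs=["path/to/corpus/curation/report-output.html"],
    properties={"fsm": "/path/to/corpus/corpus-utilities/segmentation.fsm"},
    fix=True,
)
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would launch the run. Because the arguments are passed as a list, the property value needs no extra shell quoting.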