LSC manual


    Installation

    No explicit installation is required for LSC. You may copy the LSC binaries to any location as long as all the binaries (including Novoalign) are in the same directory or path.

    However, Python 2.6 must be installed on your computer. The modules "numpy" and "scipy" are also required. See the runLSC.py and run.cfg sections for more details.
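
    The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of the LSC package; the helper name missing_modules is hypothetical. It only reports whether "numpy" and "scipy" can be imported and which Python version is running, so you can compare it with the required one yourself.

```python
import sys


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Print the interpreter version so it can be compared with the manual.
    print("Running Python %d.%d" % (sys.version_info[0], sys.version_info[1]))
    absent = missing_modules(["numpy", "scipy"])
    if absent:
        print("Missing modules: " + ", ".join(absent))
    else:
        print("numpy and scipy are importable")
```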

    Using LSC

    Firstly, see the tutorial on how to use LSC on some example data.

    In order to use LSC on your own data:

    1. Create an empty directory. This will be the working directory.
    2. Copy "run.cfg" from the LSC package to the working directory.
    3. Edit run.cfg to include paths to your data files and the paths of the temp folder and the output folder. You may also want to configure the default settings.
    4. Execute "/home/user/LSC_path/runLSC.py run.cfg" while in your working directory, or execute "runLSC.py run.cfg" if all LSC executable files are in the default bin directory.
    5. Execution will take some time. When it finishes, you can find the results in the "output" directory.
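
    The steps above can also be scripted. The following is a minimal sketch under stated assumptions: the function names (prepare_workdir, lsc_command, run_lsc) are hypothetical and not part of LSC, and "runLSC.py" is assumed to be executable. Step 3 (editing run.cfg) still has to be done by hand between the two calls.

```python
import os
import shutil
import subprocess


def prepare_workdir(lsc_path, workdir):
    """Create the working directory and copy run.cfg into it (steps 1 and 2)."""
    if not os.path.isdir(workdir):
        os.makedirs(workdir)
    cfg_copy = os.path.join(workdir, "run.cfg")
    shutil.copy(os.path.join(lsc_path, "run.cfg"), cfg_copy)
    return cfg_copy


def lsc_command(lsc_path=None):
    """Build the command from step 4; without lsc_path, rely on the default path."""
    if lsc_path is None:
        script = "runLSC.py"
    else:
        script = os.path.join(lsc_path, "runLSC.py")
    return [script, "run.cfg"]


def run_lsc(workdir, lsc_path=None):
    """Run LSC from inside the working directory and return its exit code."""
    return subprocess.call(lsc_command(lsc_path), cwd=workdir)
```

    For example, call prepare_workdir("/home/user/LSC_path", "lsc_run"), edit "lsc_run/run.cfg", and then call run_lsc("lsc_run", "/home/user/LSC_path").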

    Module: runLSC

    "runLSC.py" is the main program in the LSC package. It calls the other modules to run the full error correction on your data. Output is written to the "output" folder. Details of the output are described in file formats. Its options are described in run.cfg. You only need to run "runLSC.py" with a configuration file "run.cfg":

    /home/user/LSC_path/runLSC.py run.cfg
    run.cfg
    the .cfg file which defines the run parameters. For details, see .cfg format

    Example:

    runLSC.py run.cfg
    or
    /home/user/LSC_path/runLSC.py run.cfg

    where "/home/user/LSC_path/" is the path of your LSC package. If you have put all LSC executable files in the default path, then you just need to run the first example.

    Output files

    There are three output files in the output folder: corrected_LR_SR.map.fa, full_LR_SR.map.fa, and uncorrected_LR_SR.map.fa.
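
    A quick way to confirm that a run produced these files is to count the reads in each one. The sketch below assumes the files are in FASTA format (suggested by the ".fa" extension) and counts header lines; the function names are hypothetical and not part of LSC.

```python
import os

# File names as listed in this manual.
OUTPUT_FILES = [
    "corrected_LR_SR.map.fa",
    "full_LR_SR.map.fa",
    "uncorrected_LR_SR.map.fa",
]


def count_fasta_records(path):
    """Count header lines (those starting with '>') in a FASTA file."""
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


def summarize_output(output_dir):
    """Map each expected output file to its record count, or None if absent."""
    summary = {}
    for name in OUTPUT_FILES:
        path = os.path.join(output_dir, name)
        if os.path.isfile(path):
            summary[name] = count_fasta_records(path)
        else:
            summary[name] = None
    return summary
```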

    The quality (error rate) of the corrected reads in corrected_LR_SR.map.fa depends on their SR coverage. In principle, their error rates are lower than those of the corresponding raw reads. If you want to select the "best" reads for your downstream analysis, you can:

    1. Map the reads to the reference genome or annotation (for RNA-seq analysis). Then filter the reads by mapping score or percentage of base matches (e.g. "identity" in BLAT).

    2. If there is no reference genome or annotation, map the short reads back to the output (corrected_LR_SR.map.fa or full_LR_SR.map.fa). Select the reads with high SR coverage.
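
    The first option can be sketched as follows once an identity value per read is available. This is an illustration only: the function names, the identity_by_read mapping, and the 0.9 threshold are hypothetical and not part of LSC, the output is assumed to be FASTA, and computing the identity values from the aligner output is left out.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the read name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def select_reads(fasta_lines, identity_by_read, min_identity=0.9):
    """Keep reads whose identity is at least min_identity.

    Reads without an identity value (e.g. unmapped reads) are dropped.
    """
    selected = []
    for name, seq in read_fasta(fasta_lines):
        if identity_by_read.get(name, 0.0) >= min_identity:
            selected.append((name, seq))
    return selected
```

    The same filter works for the second option if the mapping holds SR coverage per read instead of identity, with a threshold chosen accordingly.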

    Configuring Novoalign

    You need two Novoalign modules: novoalign and novoindex. We recommend version "Novoalign V2.07.10".

    Note that Novoalign is proprietary software, so we cannot distribute it with LSC. However, if you are licensed to use Novoalign, contact us and we can email a copy to you.

    Execution Time

    The following execution times are estimates based on the running times on our servers with eight threads. These figures will differ greatly depending on your system configuration.

    LSC should run faster than similar tools.