===================================================================
HiTEC: accurate error correction in high-throughput sequencing data
Version 1.0.2: Feb.9, 2011
Copyright 2010 Lucian Ilie, Farideh Fazayeli, Silvana Ilie
===================================================================
HiTEC is a C++ program which corrects sequencing errors in high-throughput sequencing data, such as those generated by the Illumina Genome Analyzer.
Changes since the previous version:
-- Bug fixes
HiTEC is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
HiTEC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
System Requirements
===================
HiTEC has been tested on systems running Linux on an x86_64 architecture. Compiling the program requires GSL, the GNU Scientific Library (http://www.gnu.org/software/gsl/).
How to install GSL
==================
1- Download the current stable version of GSL from ftp.gnu.org, in the directory /pub/gnu/gsl (we used gsl-1.9).
2- Install GSL
./configure --prefix=GSL_PATH
make
make install
This installs GSL under the directory passed to --prefix (GSL_PATH above stands for a path of your choice). Please consult the INSTALL file in the gsl directory for more detailed instructions.
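As a quick sanity check after installation, the sketch below tests whether a prefix directory contains the GSL headers and library. The function name check_gsl_prefix is hypothetical (it is not part of GSL or HiTEC), and the check assumes the default GSL layout of include/gsl/ and lib/ under the prefix.

```shell
# check_gsl_prefix DIR: hypothetical helper, not part of GSL or HiTEC.
# Succeeds if DIR looks like a GSL installation prefix.
# Assumes the default layout: DIR/include/gsl/*.h and DIR/lib/libgsl.*
check_gsl_prefix() {
    # gsl_version.h is one of the headers GSL installs under include/gsl/
    if [ ! -f "$1/include/gsl/gsl_version.h" ]; then
        echo "GSL headers not found under $1/include"
        return 1
    fi
    # Accept either the static (libgsl.a) or a shared (libgsl.so*) library
    if ! ls "$1"/lib/libgsl.* >/dev/null 2>&1; then
        echo "libgsl not found under $1/lib"
        return 1
    fi
    echo "GSL found under $1"
}
```

For example, if GSL was configured with --prefix=$HOME/gsl (an illustrative path), then check_gsl_prefix "$HOME/gsl" should report that GSL was found.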
How to install HiTEC
====================
1- gunzip hitecXY.tar.gz (XY = 32 or 64)
tar -xvf hitecXY.tar
2- Modify the makefile in the HiTEC directory to point to the GSL installation on your machine (GSL_PATH below stands for the directory in which GSL is installed)
2.a) Set INCLUDEDIRSGSL as
INCLUDEDIRSGSL = -IGSL_PATH/include
For example, we installed GSL in //work/ffazayel/brown/gsl-19 and used
INCLUDEDIRSGSL = -I//work/ffazayel/brown/gsl-19/include
2.b) Set LIBDIRS as
LIBDIRS = -LGSL_PATH/lib
We used
LIBDIRS = -L//work/ffazayel/brown/gsl-19/lib
3- Compile HiTEC
make
This creates the executable, hitec, in the HiTEC directory.
How to run HiTEC
================
./hitec input_file output_file genome_length error_rate
Input Parameters:
- path and name of the input file containing the reads in FASTA or FASTQ format (http://en.wikipedia.org/wiki/FASTQ_format). In the current HiTEC version, reads must be of equal length and contain only the letters {A,C,G,T}.
- path and name of the output file containing the corrected reads in FASTA format.
- length of the genome; if the exact value is not known, supply an approximate value
- error rate x 100 (for example, 1 for a 1% error rate); if the exact value is not known, supply an approximate value
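Because reads must be of equal length and contain only the letters {A,C,G,T}, it can help to check the input before running HiTEC. The following is a minimal sketch for FASTA input only; the function name check_reads is hypothetical, and it assumes each read occupies a single line (no wrapped sequences). Header lines and blank lines are skipped.

```shell
# check_reads FILE: hypothetical pre-check, not part of HiTEC.
# Succeeds only if every read in a FASTA file has the same length and
# uses only the letters A, C, G, T. Assumes one sequence line per read.
check_reads() {
    awk '
        /^>/ || /^$/ { next }            # skip header lines and blank lines
        $0 !~ /^[ACGT]+$/ {              # reject N, lowercase letters, etc.
            printf "line %d: characters other than A, C, G, T\n", NR
            bad = 1; exit
        }
        len == 0 { len = length($0) }    # remember the length of the first read
        length($0) != len {
            printf "line %d: length %d, expected %d\n", NR, length($0), len
            bad = 1; exit
        }
        END { exit bad + 0 }             # exit status 0 only if no problem was seen
    ' "$1"
}
```

For example, check_reads S.aureusReads.fasta prints the first offending line, if any, and returns a non-zero status in that case.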
Sample
======
For the sample dataset, the command is
./hitec S.aureusReads.fasta S.aureusCorrectedReads.fasta 2820462 1
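The last argument in the sample, 1, is the error rate multiplied by 100, i.e. an error rate of 0.01. If the error rate is known as a fraction, a small helper such as the following sketch can do the conversion. The name error_rate_arg is hypothetical, and whether HiTEC accepts non-integer values for this argument is not stated here, so the example sticks to whole numbers.

```shell
# error_rate_arg FRACTION: hypothetical helper, not part of HiTEC.
# Prints FRACTION x 100, the form used for the fourth hitec argument.
error_rate_arg() {
    awk -v r="$1" 'BEGIN { printf "%g\n", r * 100 }'
}

error_rate_arg 0.01   # prints 1
```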
=================================================
For any questions, please contact
Lucian Ilie
e-mail: ilie@csd.uwo.ca
=================================================