===================================================================
HiTEC: accurate error correction in high-throughput sequencing data
Version 1.0.2: Feb. 9, 2011
Copyright 2010 Lucian Ilie, Farideh Fazayeli, Silvana Ilie
===================================================================

HiTEC is a C++ program that corrects sequencing errors in
high-throughput sequencing data, such as those generated by the
Illumina Genome Analyzer.

Changes compared to the former version:
-- Bug fixes

HiTEC is free software: you can redistribute it and/or modify it under
the terms of the GNU Lesser General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version.

HiTEC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details: .

System Requirements
===================

HiTEC has been tested on systems running Linux on an x86_64
architecture. Compiling the program requires GSL, the GNU Scientific
Library (http://www.gnu.org/software/gsl/).

How to install GSL
==================

1- Download the current stable version of GSL from ftp.gnu.org, in the
   directory /pub/gnu/gsl (we used gsl-1.9).

2- Install GSL:

     ./configure --prefix=
     make
     make install

   This installs GSL under the path given to --prefix. Please consult
   the INSTALL file in the gsl directory for more detailed
   instructions.

How to install HiTEC
====================

1- Unpack the archive (XY = 32 or 64):

     gunzip hitecXY.tar.gz
     tar -xvf hitecXY.tar

2- Modify the makefile in the HiTEC directory to set the GSL path as
   installed on your machine.

   2.a) Set INCLUDEDIRSGSL as

          INCLUDEDIRSGSL = -I/include

        For example, we installed the GSL library in
        //work/ffazayel/brown/gsl-19 and set

          INCLUDEDIRSGSL = -I//work/ffazayel/brown/gsl-19/include

   2.b) Set LIBDIRS as

          LIBDIRS = -L/lib

        We set

          LIBDIRS = -L//work/ffazayel/brown/gsl-19/lib

3- Compile HiTEC:

     make

   This creates the hitec executable in the HiTEC directory.

How to run HiTEC
================

  ./hitec

Input parameters, in the order they are given on the command line:

- path and name of the input file containing the reads in FASTA or
  FASTQ format (http://en.wikipedia.org/wiki/FASTQ_format). In the
  current HiTEC version, reads must be of equal length and contain only
  the letters {A,C,G,T}.
- path and name of the output file, which will contain the corrected
  reads in FASTA format.
- length of the genome; if the real value is not known, supply an
  approximate value.
- error rate x 100 (for example, an error rate of 0.01, i.e. 1%, is
  given as 1); if the real value is not known, supply an approximate
  value.

Sample
======

For the sample dataset, the command is:

  ./hitec S.aureusReads.fasta S.aureusCorrectedReads.fasta 2820462 1

=================================================
For any questions, please contact Lucian Ilie
e-mail: ilie@csd.uwo.ca
=================================================
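
Checking the input reads (sketch)
=================================

The input requirements listed under "How to run HiTEC" (reads of equal
length, containing only the letters {A,C,G,T}) can be checked before a
run. The following is a minimal sketch of such a check, not part of
HiTEC. The names ReadCheck, acceptRead and checkFastaReads are
hypothetical. It assumes FASTA input only (FASTQ is not handled) and
upper-case letters, and it silently skips records with an empty
sequence.

```cpp
#include <cassert>   // convenient for callers that assert on the result
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Result of the pre-check (hypothetical helper type, not part of HiTEC).
struct ReadCheck {
    bool ok = true;           // false once a read violates a rule
    std::size_t reads = 0;    // number of reads accepted so far
    std::size_t length = 0;   // common read length (0 until the first read)
    std::string problem;      // description of the first violation
};

// Check one complete read against the two documented constraints.
// Empty sequences are skipped; nothing is checked after a failure.
static void acceptRead(const std::string& seq, ReadCheck& r) {
    if (!r.ok || seq.empty()) return;
    const std::string id = std::to_string(r.reads + 1);
    if (seq.find_first_not_of("ACGT") != std::string::npos) {
        r.ok = false;
        r.problem = "read " + id + " contains a letter outside {A,C,G,T}";
        return;
    }
    if (r.reads == 0) {
        r.length = seq.size();  // the first read fixes the expected length
    } else if (seq.size() != r.length) {
        r.ok = false;
        r.problem = "read " + id + " has length " +
                    std::to_string(seq.size()) + ", expected " +
                    std::to_string(r.length);
        return;
    }
    ++r.reads;
}

// Scan a FASTA stream: lines starting with '>' begin a new record,
// all other lines are appended to the current sequence.
ReadCheck checkFastaReads(std::istream& in) {
    ReadCheck r;
    std::string line, seq;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF
        if (!line.empty() && line[0] == '>') {
            acceptRead(seq, r);  // the previous record is now complete
            seq.clear();
        } else {
            seq += line;
        }
    }
    acceptRead(seq, r);  // the last record has no header after it
    return r;
}
```

To check a file, open it with std::ifstream and pass the stream to
checkFastaReads; if the returned ok field is false, the problem field
describes the first read that breaks one of the two rules.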