Naeflab
Fou++
The fou++ code (download) applies spectral analysis to tiling array time series data. Smoothing is performed by grouping probes in bins of a fixed size in base pairs (cf. the variable $size below). The analysis can be run in two modes:
  • "F" for free, where the annotation is not used. Instead, a sliding window is run across the whole chromosome and returns a score for each genomic position.
  • "A" for annotation, where exons are treated as single blocks, i.e. all probes in any given exon are smoothed together. Intergenic and intronic regions are treated as in the "F" mode.
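The difference between the two modes can be illustrated with a small Perl sketch. It is not taken from fou.cc: the helper name group_probes, the probe hash layout, and the choice of a window centred on each probe are assumptions made for illustration only. As a simplification, the window used for non-exonic probes in "A" mode does not exclude neighbouring exonic probes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fou): for each probe, return the indices
# of the probes that would be smoothed together with it.
# Each probe is a hash ref: { pos => position in bp, exon => exon id or undef }.
sub group_probes {
    my ($probes, $size, $mode) = @_;
    my @groups;
    for my $i (0 .. $#$probes) {
        my $p = $probes->[$i];
        my @members;
        for my $j (0 .. $#$probes) {
            my $q = $probes->[$j];
            if ($mode eq 'A' && defined $p->{exon}) {
                # "A" mode, exonic probe: all probes of the same exon form one block
                push @members, $j
                    if defined $q->{exon} && $q->{exon} eq $p->{exon};
            } else {
                # "F" mode, or non-exonic probe in "A" mode:
                # assumed window of $size bp centred on the probe
                push @members, $j
                    if abs($q->{pos} - $p->{pos}) <= $size / 2;
            }
        }
        push @groups, \@members;
    }
    return \@groups;
}

# Toy data: two intergenic probes and two probes in a made-up exon "e1"
my @probes = (
    { pos => 100, exon => undef },
    { pos => 150, exon => undef },
    { pos => 400, exon => 'e1' },
    { pos => 700, exon => 'e1' },
);
my $free = group_probes(\@probes, 200, 'F');
my $ann  = group_probes(\@probes, 200, 'A');
printf "probe at 400: %d probe(s) in F mode, %d in A mode\n",
    scalar @{ $free->[2] }, scalar @{ $ann->[2] };
```

In this toy example the two exonic probes are 300 bp apart, so a 200 bp window separates them in "F" mode while "A" mode groups them into one block.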
./run.pl is a wrapper that processes the sample data files (included in the .tgz archive):
  • LLCrick.txt+A7.sample
  • LLWatson.txt+A7.sample
Note: these files contain the normalized raw data (cf. e.g. LLWatson_all_chrom.txt) to which annotation has been added in columns 14-20.

The wrapper produces the following output files:
  • LLWatson.txt.fou.A.V7.RW200
  • LLCrick.txt.fou.A.V7.RW200
Note: In this sample run, probes were pruned when the mean expression was below 3 and the standard deviation was below 0.25. See the command in the script below for how these cutoffs are specified.
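The pruning rule can be written down as a short Perl sketch. The helper names mean_sd and is_pruned are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the page does not say which estimator fou uses. Requiring both conditions agrees with the exonic example further down (mean 2.745, sd 0.387), which has a positive chromosome column and is therefore not pruned.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: mean and sample standard deviation of a time series.
# The n - 1 denominator is an assumption, not taken from fou.cc.
sub mean_sd {
    my @x    = @_;
    my $n    = @x;
    my $mean = sum(@x) / $n;
    my $var  = $n > 1 ? sum(map { ($_ - $mean) ** 2 } @x) / ($n - 1) : 0;
    return ($mean, sqrt($var));
}

# A probe is pruned only if BOTH the mean and the sd fall below their cutoffs.
sub is_pruned {
    my ($mean, $sd, $mean_cut, $sd_cut) = @_;
    return ($mean < $mean_cut && $sd < $sd_cut) ? 1 : 0;
}

# Cutoffs of the sample run: mean < 3.0 and sd < 0.25
my ($mean, $sd) = mean_sd(2.4, 2.5, 2.6, 2.5);   # made-up, nearly flat series
print "pruned\n" if is_pruned($mean, $sd, 3.0, 0.25);
print "exonic example kept\n" unless is_pruned(2.745, 0.387, 3.0, 0.25);
```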

The output columns contain the following (example: an intergenic probe):

column  example    description
1       6          probe number
2       all        identifier (all for intergenic)
3       all        dummy identifier
4       none       dummy identifier
5       ig         secondary identifier (ig for intergenic)
6       500        position on the chromosome
7       3.332      mean of expression across the 12 time points
8       0.643      sd of expression
9       1          chromosome (negative if the probe was pruned)
10      5.000      number of probes smoothed at that position
11      0.010      F24 score
12      21.727     phase in hours
13      7.580e-01  p-value associated with the F24 score
14      N          non-coding (N) or coding (C)
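This page does not define the F24 score or the phase. One common way to quantify a 24 h rhythm, assumed here purely for illustration, is the fraction of the series variance carried by the 24 h Fourier component, with the phase reported as the time of the peak. The helper name f24_and_phase and the 4 h sampling interval in the usage example are hypothetical and not taken from fou.cc; the p-value is not computed.

```perl
use strict;
use warnings;

use constant PI => 4 * atan2(1, 1);

# Hypothetical helper: fraction of variance in the 24 h Fourier component
# and peak phase (in hours) for a series sampled every $dt hours.
# The variance fraction is exact only if the series spans a whole number
# of 24 h periods and 24 h is not the Nyquist period.
sub f24_and_phase {
    my ($x, $dt) = @_;
    my $n    = @$x;
    my $mean = 0;
    $mean += $_ / $n for @$x;

    my ($re, $im, $tot) = (0, 0, 0);
    for my $k (0 .. $n - 1) {
        my $d = $x->[$k] - $mean;          # mean-subtracted value
        my $w = 2 * PI * $k * $dt / 24;    # angle of time point k on a 24 h cycle
        $re  += $d * cos($w);
        $im  += $d * sin($w);
        $tot += $d * $d;                   # total sum of squares
    }
    return (0, 0) if $tot == 0;            # flat series: no rhythm

    # The 24 h component appears at positive and negative frequency,
    # hence the factor 2 (Parseval's theorem).
    my $score = 2 * ($re ** 2 + $im ** 2) / ($n * $tot);
    my $phase = atan2($im, $re) * 24 / (2 * PI);
    $phase += 24 if $phase < 0;
    return ($score, $phase);
}

# Made-up series: pure 24 h cosine peaking at 6 h, 12 points every 4 h
my @series = map { 3 + cos(2 * PI * ($_ * 4 - 6) / 24) } 0 .. 11;
my ($score, $phase) = f24_and_phase(\@series, 4);
printf "score=%.3f phase=%.3f h\n", $score, $phase;
```

For a pure 24 h cosine the score is 1 and the phase equals the peak time; noise or other periodic components lower the score.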
 
The output columns contain the following (example: an exonic probe):

column  example    description
1       678        probe number
2       AT1G01070  identifier (here the gene name)
3       all        dummy identifier
4       none       dummy identifier
5       tu6        secondary identifier (tu for transcription unit, intron for introns)
6       40660      position on the chromosome
7       2.745      mean of expression across the 12 time points
8       0.387      sd of expression
9       1          chromosome (negative if the probe is pruned)
10      3.000      number of probes smoothed at that position
11      0.066      F24 score
12      20.172     phase in hours
13      3.522e-01  p-value associated with the F24 score
14      C          non-coding (N) or coding (C)
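For downstream use, one line of an output file can be split into the 14 columns described above. The sketch below assumes the fields are whitespace-separated. The field names and the helper parse_fou_line are hypothetical labels chosen for readability, not names used by fou.

```perl
use strict;
use warnings;

# Hypothetical field names for the 14 output columns, in order
my @FIELDS = qw(probe id dummy1 dummy2 id2 pos mean sd chr
                nsmoothed f24 phase pvalue coding);

# Parse one output line into a hash ref; returns nothing if the line
# does not have exactly 14 whitespace-separated fields.
sub parse_fou_line {
    my ($line) = @_;
    my @v = split ' ', $line;
    return unless @v == @FIELDS;
    my %rec;
    @rec{@FIELDS} = @v;
    # a negative chromosome number marks a pruned probe
    $rec{pruned} = $rec{chr} < 0 ? 1 : 0;
    return \%rec;
}

# The exonic example from the table above
my $rec = parse_fou_line(
    "678 AT1G01070 all none tu6 40660 2.745 0.387 1 3.000 0.066 20.172 3.522e-01 C"
);
printf "%s phase=%.3f p=%g pruned=%d\n",
    $rec->{id}, $rec->{phase}, $rec->{pvalue}, $rec->{pruned};
```

The same function can be applied line by line to LLWatson.txt.fou.A.V7.RW200 or LLCrick.txt.fou.A.V7.RW200, for example to keep only unpruned probes below a chosen p-value.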
 

++++++++++++++++
This is the run.pl wrapper (included in the fou_v1.tgz distribution):
#!/usr/bin/env perl
# Runs the entire sequence of steps necessary to generate the cycling scores.
use warnings;
use strict;

my $size = 200;            # smoothing window (bin size in base pairs)
my @chrs = (1,2,3,4,5);    # list of chromosomes (unused: fou now processes all chromosomes at once)

my $type  = "A";           # type of analysis: "A" uses annotation-based binning, "F" ignores the annotation
my @bases = ("LLWatson","LLCrick");   # which files to process

# compile before we start
#`g++ fou.cc -o fou -ggdb`;
system("make", "fou") == 0 or die "make fou failed\n";

foreach my $base (@bases) {

    # the original large data file
    my $orig = "$base.txt+A7.sample";

    my $slope = "none";
    #my $slope = "slopes/win$size.chr$chr.$type.exp";

    # output files; fou now processes all chromosomes at once
    my $data      = "$base.txt.fou.$type.V7.RW$size";
    my $f24s_file = "$base.win$size.f24s";

    # use this version for the mean<3, sd<0.25 cutoff
    # (the 3.0 and 0.25 arguments; stderr goes to $f24s_file, stdout to $data)
    my $fou_command =
        "./fou 1 0 3.0 $size 0 $type 0 $slope 0.25 $orig 2> $f24s_file 1> $data";
    print "$fou_command\n";
    system($fou_command) == 0 or die "fou failed for $base\n";
}
 