Naeflab
SLM Documentation
SLM User's Guide




Introduction


This manual assumes that SLM has already been installed.
For instructions on how to install SLM, see the FAQ section, or check the README in the installation package.
This manual explains how to run SLM and use the different options included. This is a preliminary version and some sections are not complete yet. However, there should be enough here to get you started with SLM.




Quick Start

To use SLM, you will have to know the directory where SLM has been installed.
To simplify your analyses, we suggest that you add the SLM directory to your path. Assuming that SLM has been installed in /usr/local/SLM, use

   %> setenv PATH /usr/local/SLM:$PATH
for tcsh and csh, or

   %> export PATH=/usr/local/SLM:$PATH
for bash or sh.

One of the first commands you might run is

   %> SLM

to find out the exact version and configuration of SLM you are working with.




Running SLM

To run the standard analysis with SLM, use the following

   %> SLM inputfile outputfile

Input Data

SLM takes a text data file as input.
This has to be a TAB-delimited text file whose fields vary according to the selected options. See option -K.

Output Data

SLM outputs a TAB-delimited table. If no output filename is specified, SLM will use standard output.
The fields vary according to the selected options. See option -P.

Options

To run SLM with any of the options below, use the following

   %> SLM [options] inputfile outputfile

The available options are:

  -B <int>
specifies the background subtraction method to be used. The implemented methods are:
       
    0 skips the background subtraction (not recommended if the data has not been background subtracted)
    1 subtracts an RMA-like estimator, which models nucleotide affinities with second-order Legendre polynomials for all probes (see References). Probes can be excluded from the estimation by setting a cutoff on the intensity distribution (see option -I).
    2 subtracts a constant optical background, chosen as the 3rd percentile of the overall intensity distribution. Before subtraction, all probes below the 5th percentile are linearly scaled between the 5th and the 3rd percentile (see the supplementary material of the paper listed in References).
    3 subtracts a local optical background, chosen as the 3rd percentile of the intensity distribution within a window (see option -W). Before subtraction, all probes in the window below the 5th percentile are linearly scaled between the 5th and the 3rd percentile (see the supplementary material of the paper listed in References).
    4 subtracts the mismatch probe if available.
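
The constant optical background of method 2 can be illustrated with a short Python sketch. This is not SLM's code: the percentile interpolation and the exact meaning of "linearly scaled" are assumptions made here for illustration (values between the lowest intensity and the 5th percentile are mapped linearly onto the range between the 3rd and the 5th percentile, so that nothing becomes negative after subtraction). The function names are hypothetical.

```python
def percentile(sorted_values, q):
    """Percentile q (0..100) of an already sorted list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = rank - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def subtract_constant_background(intensities):
    """Illustrative reading of a constant optical background subtraction.

    Assumption (not taken from SLM): intensities below the 5th percentile
    are mapped linearly from [lowest, p5] onto [p3, p5] before p3 is
    subtracted from every probe.
    """
    ordered = sorted(intensities)
    p3 = percentile(ordered, 3)
    p5 = percentile(ordered, 5)
    lowest = ordered[0]
    result = []
    for x in intensities:
        if x < p5 and p5 > lowest:
            # Rescale the low tail so it ends up between p3 and p5.
            x = p3 + (x - lowest) * (p5 - p3) / (p5 - lowest)
        result.append(x - p3)
    return result
```

With this reading, every background-subtracted intensity is non-negative and the ordering of the probes is preserved.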

  -N <int>
specifies the normalization method to be used. The implemented methods are:
       
    0 skips the normalization (not recommended if the data has not been normalized).
    1 normalizes the probe intensities using quantile normalization, imposing a final value of 100 for the median.
    2 normalizes by applying a global scaling factor such that the final median of each sample is equal to 100.
    3 normalizes by applying a local scaling factor such that the median in local windows (see option -W) for each sample is equal to 100.
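
As an illustration of methods 1 and 2, the sketch below shows a textbook quantile normalization followed by a rescaling to a median of 100, and a plain global scaling. It is a minimal sketch, not SLM's implementation: the function names are hypothetical, ties are broken by input order, and medians are assumed to be non-zero.

```python
from statistics import median


def quantile_normalize(samples, target_median=100.0):
    """Give every sample the same distribution, with median target_median.

    samples: list of equal-length lists of intensities, one list per sample.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same number of probes")
    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)]
    scale = target_median / median(reference)
    reference = [v * scale for v in reference]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


def scale_to_median(sample, target_median=100.0):
    """Global scaling: one factor per sample so that its median is target_median."""
    factor = target_median / median(sample)
    return [v * factor for v in sample]
```

Quantile normalization changes the shape of each sample's distribution, whereas global scaling only multiplies each sample by a single factor.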

  -E <int>
SLM estimates the log-enrichment between experiments and controls, assuming it to be uniform in a window (see option -S) around the probe of interest. The variance of the estimator is also computed. This option selects the smoothing kernel and the method used to estimate the variance (see also option -U).
For details, see References.
The implemented methods are:
       
    0 calculates the log-enrichment using a uniform smoothing kernel in a window around the probe of interest. The variance is estimated using the theoretical formula.
    1 calculates the log-enrichment using a Gaussian smoothing kernel around the probe of interest. To adjust the width of the kernel, see option -S. The variance is estimated using the theoretical formula.
    2 calculates the log-enrichment using a uniform smoothing kernel in a window around the probe of interest. The variance is estimated using a bootstrap approach. (requires longer running time)
    3 calculates the log-enrichment using a uniform smoothing kernel in a window around the probe of interest. The variance is estimated using a bootstrap approach. (requires longer running time)
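
The difference between the uniform and the Gaussian kernel can be sketched as follows. This is an illustration only: the estimator and its variance are defined in the paper listed in References, and here a simple weighted mean of per-probe log-ratios stands in for it. How SLM maps the width given with -S onto the kernel is not stated in this guide; the sketch assumes that the uniform window has total width `width` and that the Gaussian uses `width` as its standard deviation. All names are hypothetical.

```python
import math


def kernel_weights(positions, center, width, kind="uniform"):
    """Weight of each probe relative to the probe of interest at `center`."""
    weights = []
    for pos in positions:
        d = pos - center
        if kind == "uniform":
            # Assumption: window of total width `width`, centered on the probe.
            w = 1.0 if abs(d) <= width / 2 else 0.0
        elif kind == "gaussian":
            # Assumption: `width` plays the role of the standard deviation.
            w = math.exp(-0.5 * (d / width) ** 2)
        else:
            raise ValueError("kind must be 'uniform' or 'gaussian'")
        weights.append(w)
    return weights


def smoothed_log_enrichment(positions, log_ratios, center, width, kind="uniform"):
    """Weighted mean of log-ratios and the effective number of probes."""
    weights = kernel_weights(positions, center, width, kind)
    effective = sum(weights)  # "number of effective probes": sum of the weights
    if effective == 0:
        raise ValueError("no probe falls under the kernel")
    beta = sum(w * r for w, r in zip(weights, log_ratios)) / effective
    return beta, effective
```

With the uniform kernel every probe in the window counts equally, while the Gaussian kernel gives more weight to probes close to the probe of interest.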

  -P <int>
Printing option. SLM can output the processed data at any step, so that it can easily be inserted into an analysis pipeline.
The implemented methods are:
       
    0 Prints a TAB-delimited table with three columns:
column 1: genomic position of the mean of the fitted peak contour
column 2: t-score (or t-like score, see option -U)  of the mean of the fitted peak contour
column 3: standard deviation in base pairs of the fitted peak contour
    1 Prints a TAB-delimited table with eight columns:
column 1: genomic position of the mean of the fitted peak contour
column 2: error on the position of the mean of the fitted peak contour
column 3: t-score (or t-like score, see option -U) of the mean of the fitted peak contour
column 4: error on the t-score (or t-like score, see option -U) of the mean of the fitted peak contour
column 5: standard deviation in base pairs of the fitted peak contour
column 6: error on the standard deviation in base pairs of the fitted peak contour
column 7: number of probes covered by the smoothing kernel
column 8: maximum log-enrichment in the window used by the peak-detection algorithm
    2 Prints a TAB-delimited table with four columns:
column 1: genomic position of the probe of interest
column 2: number of effective probes: sum of the weights as assigned by the smoothing kernel
column 3: log-enrichment (beta) for the probe of interest
column 4: variance of the log-enrichment estimator
    3 Prints a TAB-delimited table.
column 1: genomic position of the probe
then, the columns are the normalized control samples, followed by the normalized experiment samples
    4 Prints a TAB-delimited table.
column 1: genomic position of the probe
then, the columns are the background subtracted control samples, followed by the background subtracted experiment samples

  -S <int>
Width of the smoothing kernel in base pairs.
(see option -E)

  -W <int>
Width of the sliding window in base pairs.
(see options -B3 and -N3)

  -I <double>
Sets the upper cutoff on the probe intensity, applied to each sample separately: only probes below the cutoff are used for the RMA-like background estimation.
If I>1, the value is taken as an intensity. If 0<I<1, it is taken as a percentile of the raw intensity distribution.
(see option -B1)
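
The two readings of the -I value can be summarized in a few lines of Python. This is a sketch of the rule stated above, not SLM's code: the nearest-rank percentile is an assumption, and values the guide does not describe (0 or below, or exactly 1) are returned as None rather than guessed. The function name is hypothetical.

```python
def resolve_intensity_cutoff(value, raw_intensities):
    """Turn an -I style value into an intensity cutoff for one sample."""
    if value > 1:
        return float(value)  # already an intensity
    if 0 < value < 1:
        ordered = sorted(raw_intensities)
        # Assumption: nearest-rank percentile of the raw intensities.
        index = min(int(value * len(ordered)), len(ordered) - 1)
        return float(ordered[index])
    return None  # case not described in this guide
```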

  -U <int>
If 1, then the unbiased estimator of the log-enrichment variance is computed. If 0, the biased estimator is used instead.

  -C <int>
Specifies the number of controls in the data. If not specified, SLM assumes that half of the data columns are controls.

  -T <int>
If 1, then the first column of the input file is assumed to be a 25-mer. If 0 then no probe tag is given, and the first column is the genomic position of the probe.
This option is ignored if option -K is set to anything other than 0.

  -L <int>
If 1, then the input data is in logarithmic scale. If 0 then it is assumed to be in linear scale.

  -K <int>
SLM can read input data from any stage of the analysis pipeline, so that it can easily be inserted into an automated framework.
The implemented methods are:
       
    0 reads raw data in the form of a TAB-delimited table:
column 1: oligo sequence of 25 nucleotides used for the PM probe.
column 2: genomic position of the probe
then, the columns of the raw intensities of the control samples, followed by the raw intensities of the experiment samples
    1 reads background subtracted data (e.g. data obtained by using the option -P4) in the form of a TAB-delimited table:
column 1: genomic position of the probe
then, the columns are the background subtracted control samples, followed by the background subtracted experiment samples
    2 reads normalized data (e.g. data obtained by using the option -P3) in the form of a TAB-delimited table:
column 1: genomic position of the probe
then, the columns are the normalized control samples, followed by the normalized experiment samples
    3 reads probe log-enrichments (e.g. data obtained by using the option -P2) in the form of a TAB-delimited table:
column 1: genomic position of the probe of interest
column 2: number of effective probes: sum of the weights as assigned by the smoothing kernel
column 3: log-enrichment (beta) for the probe of interest
column 4: variance of the log-enrichment estimator
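
The raw input layout (-K0 with a probe sequence in the first column) and the default split into controls and experiments (option -C) can be illustrated with the following Python sketch. The function name and the returned dictionary are hypothetical, and the use of integer division for an odd number of data columns is an assumption, since the guide only says that half of the columns are taken as controls.

```python
def parse_raw_line(line, n_controls=None):
    """Split one raw TAB-delimited input line into its parts.

    Expected layout: probe sequence, genomic position, control
    intensities, then experiment intensities.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        raise ValueError("expected sequence, position and at least two samples")
    intensities = [float(field) for field in fields[2:]]
    if n_controls is None:
        # Default described for -C: half of the data columns are controls.
        n_controls = len(intensities) // 2
    if not 0 < n_controls < len(intensities):
        raise ValueError("need at least one control and one experiment")
    return {
        "sequence": fields[0],
        "position": int(fields[1]),
        "controls": intensities[:n_controls],
        "experiments": intensities[n_controls:],
    }
```

Passing `n_controls` explicitly corresponds to giving -C on the command line.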

 

Examples

To run SLM with standard option settings, use

   %> SLM inputfile outputfile

The outputfile will contain the list of peaks detected from the inputfile using an RMA-like background subtraction, quantile normalization, and a Gaussian kernel with unbiased estimation of the enrichment variance. The kernel width is 200 base pairs.
This is equivalent to

   %> SLM -B1 -N1 -E1 -P1 -S200 -I0 -U1 -T1 -L0 -K0 inputfile outputfile
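
When SLM is called from a script, it can help to build the command line from the standard settings listed above and override only what changes. The Python sketch below does this. The helper `slm_command` and the `DEFAULTS` table are hypothetical, the option letters and values are the ones from this guide, and the sketch only builds the argument list without checking that a given combination of options is meaningful.

```python
# Standard settings, as in the equivalent command line shown above.
DEFAULTS = {"B": 1, "N": 1, "E": 1, "P": 1, "S": 200,
            "I": 0, "U": 1, "T": 1, "L": 0, "K": 0}

# Options documented in this guide that have no value in the standard command.
EXTRA_OPTIONS = {"W", "C"}


def slm_command(inputfile, outputfile, **overrides):
    """Build the argument list for an SLM run, starting from the defaults."""
    options = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in DEFAULTS and name not in EXTRA_OPTIONS:
            raise ValueError(f"unknown SLM option: -{name}")
        options[name] = value
    argv = ["SLM"] + [f"-{name}{value}" for name, value in options.items()]
    return argv + [inputfile, outputfile]


# Example: print probe log-enrichments with a uniform kernel instead of peaks.
print(" ".join(slm_command("inputfile", "outputfile", E=0, P=2)))
```

The returned list can be passed to `subprocess.run(argv, check=True)`, assuming the SLM directory is on the PATH.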




References
  • "Identifying synergistic regulation involving c-Myc and sp1 in human tissues" - F. Parisi, P. Wirapati and F. Naef - NAR 2007

last updated - 2006-11-20
