Data format

The NormalyzerDE input consists of two matrices: the design matrix and the data matrix. The design matrix lists the samples, each of which must appear as a column header in the data matrix. It must also contain a column defining the sample groups, and may contain an additional column with batch or other blocking factors. Columns in the data matrix that are not listed as samples are retained and written to the output unchanged. Data in the legacy Normalyzer format can be converted from the 'Convert legacy' tab: upload your legacy matrix to automatically generate a data matrix and design matrix compatible with the NormalyzerDE input format. If your table begins with comment rows (lines starting with '#'), you can use the 'Proteios' input format, which trims away the comments and performs the analysis on the resulting matrix.
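
To illustrate what trimming leading comments amounts to, here is a minimal Python sketch. It is not the NormalyzerDE implementation; the function name `strip_leading_comments` and the example lines are hypothetical, and the sketch assumes that only the comment rows at the top of the file are removed.

```python
from itertools import dropwhile


def strip_leading_comments(lines):
    """Drop the run of lines starting with '#' at the top of a table.

    Hypothetical helper: only the leading comment rows are removed,
    everything from the first non-comment line onwards is kept as-is.
    """
    return list(dropwhile(lambda line: line.startswith("#"), lines))


raw_lines = [
    "# exported table",          # made-up comment rows for illustration
    "# second comment row",
    "pep\tRT\ts1_1\ts1_2",
    "ATCA\t40.3\t20.1\t20.2",
]
table_lines = strip_leading_comments(raw_lines)
print(table_lines[0])  # the header row is now the first line
```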

Design matrix
            sample  group  other
            s1_1    1      batch1
            s1_2    1      batch1
            s1_3    1      batch2
            s2_1    2      batch2
            s2_2    2      batch2
            s2_3    2      batch1
        

Please note that dashes ('-') are not allowed in the group column.

Data matrix
            pep  RT   s1_1 s1_2 s1_3 s2_1 s2_2 s2_3
            ATCA 40.3 20.1 20.2 19.5 22.3 22.5 21.9
            GCC  43.0 20.3 19.8 19.5 21.7 21.9 21.2
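
Before uploading, it can help to check the two matrices against each other. The following Python sketch is one way to do that under stated assumptions: the tables are tab-separated, the design matrix uses the column names `sample` and `group` as in the example above, and the function names `check_inputs` and `annotation_columns` are hypothetical helpers, not part of NormalyzerDE.

```python
import csv
import io

# The example tables from above, written as tab-separated text
# (the delimiter is an assumption made for this sketch).
DESIGN_TSV = (
    "sample\tgroup\tother\n"
    "s1_1\t1\tbatch1\n"
    "s1_2\t1\tbatch1\n"
    "s1_3\t1\tbatch2\n"
    "s2_1\t2\tbatch2\n"
    "s2_2\t2\tbatch2\n"
    "s2_3\t2\tbatch1\n"
)
DATA_TSV = (
    "pep\tRT\ts1_1\ts1_2\ts1_3\ts2_1\ts2_2\ts2_3\n"
    "ATCA\t40.3\t20.1\t20.2\t19.5\t22.3\t22.5\t21.9\n"
    "GCC\t43.0\t20.3\t19.8\t19.5\t21.7\t21.9\t21.2\n"
)


def read_table(text):
    """Parse tab-separated text into (header, list of row dicts)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = list(reader)
    return reader.fieldnames, rows


def check_inputs(design_text, data_text, sample_col="sample", group_col="group"):
    """Return a list of problems; an empty list means both checks passed."""
    _, design_rows = read_table(design_text)
    data_header, _ = read_table(data_text)
    problems = []
    for row in design_rows:
        # Every sample in the design matrix must be a data matrix column.
        if row[sample_col] not in data_header:
            problems.append(f"sample {row[sample_col]!r} is not a data matrix column")
        # Dashes are not allowed in the group column.
        if "-" in row[group_col]:
            problems.append(f"group {row[group_col]!r} contains a dash")
    return problems


def annotation_columns(design_text, data_text, sample_col="sample"):
    """Data matrix columns that are not listed as samples in the design matrix."""
    _, design_rows = read_table(design_text)
    data_header, _ = read_table(data_text)
    samples = {row[sample_col] for row in design_rows}
    return [name for name in data_header if name not in samples]


print(check_inputs(DESIGN_TSV, DATA_TSV))
print(annotation_columns(DESIGN_TSV, DATA_TSV))
```

For the example tables, the first call reports no problems, and the second lists `pep` and `RT` as the columns that are carried through to the output unchanged.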
        

Contrasts

The contrasts option (available for differential expression) specifies which of the groups defined in the design matrix should be compared. For instance, specifying the contrast 1-2 in the example above calculates statistics comparing samples s1_1, s1_2 and s1_3 (group 1) with s2_1, s2_2 and s2_3 (group 2). No spaces are allowed in the group names.

Multiple contrasts can be specified, separated by commas (and no spaces): 1-2,1-3. Note that this requires samples belonging to groups 1, 2 and 3 to be specified in the design matrix.
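
The contrast syntax can be illustrated with a short Python sketch. This is only a reading of the rules stated above, not the parser used by NormalyzerDE; the function name `parse_contrasts` is hypothetical. Since the dash separates the two groups of a contrast, the sketch also shows why a dash inside a group name would be ambiguous.

```python
def parse_contrasts(spec, design_groups):
    """Split a contrast string such as '1-2,1-3' into (group, group) pairs.

    Hypothetical helper: raises ValueError if the string contains spaces,
    if an item is not of the form 'A-B', or if a group is not present in
    the design matrix.
    """
    if " " in spec:
        raise ValueError("contrasts must not contain spaces")
    pairs = []
    for item in spec.split(","):
        parts = item.split("-")
        if len(parts) != 2 or not all(parts):
            raise ValueError(f"contrast {item!r} is not of the form A-B")
        for group in parts:
            if group not in design_groups:
                raise ValueError(f"group {group!r} is not in the design matrix")
        pairs.append((parts[0], parts[1]))
    return pairs


# Groups as they would be read from the design matrix group column.
groups = {"1", "2", "3"}
print(parse_contrasts("1-2,1-3", groups))
```

With only groups 1 and 2 in the design matrix, as in the example above, the contrast 1-3 would be rejected by this sketch.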

Running retention-time based normalizations

NormalyzerDE looks for a column whose header contains the free-standing word RT (for instance "RT" or "RT values", but not "RTVals"). The normalization is then performed within ranges of retention time to avoid time-based bias. The algorithm and its purpose are described here under the header "RT-Segmented Normalization".
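
The "free-standing word" rule can be approximated with a regular-expression word boundary. The Python sketch below is an illustration only: the pattern, the case-sensitive matching, and the function name `find_rt_columns` are assumptions, and the actual matching rule in NormalyzerDE may differ (for example, `\b` treats an underscore as part of a word, so "RT_values" would not match here).

```python
import re

# 'RT' surrounded by word boundaries (assumed, case-sensitive pattern).
RT_PATTERN = re.compile(r"\bRT\b")


def find_rt_columns(header):
    """Return the column names that contain the free-standing word RT."""
    return [name for name in header if RT_PATTERN.search(name)]


header = ["pep", "RT", "s1_1", "s1_2", "s1_3", "s2_1", "s2_2", "s2_3"]
print(find_rt_columns(header))
print(find_rt_columns(["pep", "RT values", "RTVals"]))
```

The first call picks out the `RT` column of the example data matrix; the second accepts "RT values" and rejects "RTVals".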


Additional information

Detailed instructions on how to interpret the NormalyzerDE evaluation graphs are found in the documentation for Normalyzer.

For information on how to interpret p-value histograms, we recommend reading the following article at Variance Explained. Briefly, the distribution is expected to be flat if no effect is present, and to show a sharp peak near zero that levels out into a flat distribution if an effect is present. Any other pattern requires closer investigation.
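
As a rough numerical counterpart to reading the histogram by eye, the following Python sketch bins p-values and compares the first bin with the average of the remaining bins. It is not part of NormalyzerDE and is not a formal test; the function names `pvalue_histogram` and `peak_ratio`, the choice of 20 bins, and the synthetic p-values are all assumptions made for illustration.

```python
def pvalue_histogram(pvalues, bins=20):
    """Count p-values in equal-width bins covering the interval [0, 1]."""
    counts = [0] * bins
    for p in pvalues:
        index = min(int(p * bins), bins - 1)  # p == 1.0 goes into the last bin
        counts[index] += 1
    return counts


def peak_ratio(counts):
    """Count in the first bin divided by the mean count of the other bins."""
    rest = counts[1:]
    return counts[0] / (sum(rest) / len(rest))


# Synthetic data: evenly spread p-values stand in for the 'no effect' case.
flat = [(i + 0.5) / 1000 for i in range(1000)]
# Adding 200 very small p-values stands in for the 'effect present' case.
with_effect = flat + [(j + 0.5) / 10000 for j in range(200)]

print(peak_ratio(pvalue_histogram(flat)))
print(peak_ratio(pvalue_histogram(with_effect)))
```

A ratio close to 1 corresponds to a flat histogram, while a ratio well above 1 corresponds to a peak near zero. Shapes that fit neither description are the ones that call for closer investigation.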