
Help: Entering Results from Experiments




Description

The Experiment and Result Entry form is used to load microarray results into the database. To do this, you need an "unrestricted" account and an account on loader.stanford.edu. For more information regarding both the accounts and access required to enter results into the database, please refer to the Accounts and Access document.

Types of microarray data accepted:

Software   | Version
Agilent    | A.6.1.1, A.7.1.1 and A.7.1.2
Affymetrix | MAS 5 (5.0, 5.1), GCOS (1.0, 1.1), and dChip 1.3
GenePix    | 3.0, 4.1, and 5.0
BeadStudio | 3
NimbleScan | 2.1.16
ScanAlyze  | All
SpotReader | 1.0

First Time Users

Experiment entry is performed by software that reads through your data files and then inserts the appropriate information into the database. First, the software must be able to find your data files in a special directory in your loader account. This means that you cannot enter data without having a loader account and without having copied your data files onto loader. There are two directories in your loader account that are used for data entry: the "incoming" directory and the "ORA-OUT" directory.

The data files that you want to enter into the database must be in the incoming/ directory on your loader account. You can only copy files onto loader using SFTP (Secure FTP), as regular, non-secure FTP access is now blocked for security reasons. If you are unsure how to upload your data, see the Moving Files via SFTP section. Please note that the incoming directory on loader is cleaned out regularly, so enter data into the database soon after uploading your data files and verify your data in the database promptly. We also recommend that you personally save a copy of ALL files. You can SFTP your files from loader to your personal computer so you will have an archived copy of your original data.
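If you upload many files at once, a batch file for a command-line sftp client can save typing. The following is a minimal Python sketch that only builds the text of such a batch file; it assumes an OpenSSH-style sftp client, and the function name sftp_batch_commands is illustrative. The cd, put and bye commands are standard sftp commands, but whether your loader account permits batch (non-interactive) logins is not stated in this document.

```python
def sftp_batch_commands(local_files, remote_dir="incoming"):
    """Return the text of an sftp batch file that uploads local_files into remote_dir.

    Assumption: the remote directory is named "incoming", as described above.
    """
    lines = [f"cd {remote_dir}"]
    for name in local_files:
        # Quote each name so that a file name containing spaces stays one argument.
        lines.append(f'put "{name}"')
    lines.append("bye")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(sftp_batch_commands(["array01.gpr", "array01_532.tif"]), end="")
```

The printed text could be saved to a file and passed to an OpenSSH client with its -b option, using your own loader username and host name. You can also type the same commands by hand in an interactive sftp session.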

The default location for the database output files (e.g. error files, search results, etc.) is a subdirectory on your loader account called ORA-OUT. This subdirectory will be created when your loader account is created. All logs and error reports will be placed in the ORA-OUT directory.

Please note that loader is a communal machine used by everyone who enters microarray results into the database. Storing files that are not related to the use of the database (even on a "temporary" or "emergency" basis) can prevent other users from loading legitimate data and is therefore grounds for having your account revoked.


What You Need

Depending on the feature extraction software that you used to generate your data, different files and information will be needed to load your experiment data.

For all types of feature extraction software, you need the items described in the table below before you enter an experiment into the database. Files may be submitted compressed or uncompressed. A compressed file keeps its normal suffix with .gz appended, for example .dat.gz.
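If you want to compress files before uploading them, the following is a minimal Python sketch of the naming convention described above. It assumes ordinary gzip compression, which is what the .gz suffix normally indicates; the function name gzip_for_upload is illustrative only.

```python
import gzip
import shutil
from pathlib import Path


def gzip_for_upload(path):
    """Write a gzip-compressed copy of `path` next to it and return the new path.

    The original suffix is kept and ".gz" is appended (scan1.dat -> scan1.dat.gz).
    The uncompressed original is left in place.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Keeping the uncompressed original fits the recommendation to save your own copy of all files.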

Experiment Details Required for Data Entry for All Feature Extraction Software

Item                                  | Unique | Required | Max # of characters | Notes
Print name¹                           |        | *        | N/A                 | Often the spotlist or Godlist name
Slide name                            | *²     | *        | 30                  | Usually a systematic name assigned by the slide printing facility
Experiment Name                       | *²     | *        | 100                 |
Green³ or Single⁴ Channel Description |        | *        | 100                 |
Red Channel Description³              |        | *        | 100                 |
Experiment Description                |        |          | 2000                | Unformatted text to describe experiment details
Category⁵                             |        | *        | 30                  | Choose from a list in the database
Subcategory⁵                          |        | *        | 30                  | Choose from a list in the database
Experiment Type⁶                      |        | *        | 40                  | Choose from a list in the database
Normalization Type⁷                   |        | *        | 30                  | Choose from a list in the database
Norm Value⁷                           |        | *        | 30                  | For normalization type "user-defined", a normalization value must be entered.
Result_Set_Name⁸                      | *      | *        | 100                 | The name of your result set
Result Set Description⁸               |        |          | 100                 | Free text description of your result set
Probe Set Algorithm⁴                  | *      |          | 100                 | Accepted values are: 'Affymetrix MAS 5' or 'dChip MBEIs', depending on what software was used to 'normalize' your data.

Table Notes

1 If you don't know which print to use for your experiment, after login, click Print List under "List Data" from the Main page or click "Print Name" on the individual experiment entry form. For further information, contact the microarray database curators.

2 Every slide name (e.g., array serial #) and experiment (hybridization) name must be unique - you may not re-use them, or use the same names as any other user. Result set names may be re-used, but only once per slide - you may have a result set called "simple normalization" for each of your slides, but only one per slide.

3 This field is required only for GenePix, ScanAlyze, or Agilent data.

4 This field is required only for Affymetrix data.

5 The database requires a category and subcategory for each experiment, both of which are chosen from lists stored in the database. Any category can be paired with any subcategory to describe an experiment. Categories, subcategories, and their descriptions can be found by clicking Category or SubCategory under "List Data" on the Main page, or by clicking Experiment Category or Experiment SubCategory on the individual experiment entry form. If you need a category or subcategory which is not already in the database, contact the microarray database curators.

6 This field is required for NimbleGen data, and is ignored for the others. Experiment types and their descriptions can be found by clicking Experiment types under "List Data" on the Main page. If you need an experiment type which is not already in the database, please send an email to the microarray database curators.

7 This field is always required, but it is ignored for Agilent and Affymetrix data.

8 These values are needed for Affymetrix and Agilent data only. See note 2 above on unique slide, experiment, and result set names.
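Before filling in the entry form, it can help to check your values against the length limits in the table above. The following is a minimal Python sketch of such a check. The dictionary keys and the function name check_experiment_details are hypothetical and are not field names used by the database; the limits and the "user-defined" rule are copied from the table. The software-specific requirements in the table notes are not checked.

```python
# Maximum lengths copied from the table above; the keys are hypothetical names.
MAX_LENGTHS = {
    "slide_name": 30,
    "experiment_name": 100,
    "green_or_single_channel_description": 100,
    "red_channel_description": 100,
    "experiment_description": 2000,
    "category": 30,
    "subcategory": 30,
    "experiment_type": 40,
    "normalization_type": 30,
    "norm_value": 30,
    "result_set_name": 100,
    "result_set_description": 100,
    "probe_set_algorithm": 100,
}


def check_experiment_details(details):
    """Return a list of problems found in a dict mapping field names to strings."""
    problems = []
    for field, value in details.items():
        limit = MAX_LENGTHS.get(field)
        if limit is not None and len(value) > limit:
            problems.append(
                f"{field}: {len(value)} characters exceeds the limit of {limit}"
            )
    # A normalization value must accompany the "user-defined" normalization type.
    if details.get("normalization_type") == "user-defined" and not details.get("norm_value"):
        problems.append("norm_value: required when normalization_type is 'user-defined'")
    return problems
```

An empty list means no problem was found by these two checks only. Uniqueness of slide, experiment, and result set names can only be verified against the database itself.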

Data Files Required for GenePix, ScanAlyze, SpotReader and Agilent Data entry

Item              | Suffix                 | Max # of characters | Notes
Data File¹        | .dat, .gpr, .srr, .txt | 50                  | Please check the print dimensions within the database before gridding your array.² For Agilent data, be sure that you have the text file and not the xml file.
Grid File         | .sag, .gps, .sra, .shp | 50                  |
Green Scan File³,⁴ | .tif                   | 50                  | Typically the 532nm scan
Red Scan File³,⁴   | .tif                   | 50                  | Typically the 635nm scan

Table Notes

1 Do not change the default column names in the data file. SpotReader in particular gives you a choice of "channel shortcut names." Any of the default two-channel options are acceptable (Ch1/Ch2, Cy3/Cy5, Green/Red, 532/635). Any other names will cause experiment loading to fail.

2 Attempts to load array data not matching print dimensions (tips/blocks x rows x columns) are disallowed. If using GenePix, do not pre-filter any of the spot-features. Only a full gpr file, with an entry for every spot, in order, can be loaded.

3 Automatic .gif generation requires that you submit two .tif (not .scn) files when entering your experiment (one for each channel). The automatic .gif generation fails occasionally. However, if your .gif is not created at experiment entry, the microarray database curators can make it for you in most cases. It is also possible to upload a preferred .gif file if you don't like the generated ones.

4 For Agilent multiplex arrays, you can now submit the uncropped image of the full slide (containing the images of multiple hybridizations) instead of the cropped images for each hyb. In this case you can also use the unsplit (2-channel) image. For an 8-plex slide, for example, do the analysis on the image of the complete slide and save the 8 result data files together with the unsplit image (1 file) and the .shp file (1 file). When you load this set into the database, select the same tif image and the same .shp file for each result data file. It does not matter whether you select the image in the red or the green image box, but leave the other box unchanged. If you have already split the image, use the split images as before.
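The suffix and length limits in the table above can be checked before you upload. The following is a minimal Python sketch; the function name check_file_name and the kind labels "data", "grid" and "scan" are hypothetical. It assumes that the 50-character limit applies to the whole file name including any .gz suffix, and that suffixes are compared without regard to case. Neither assumption is confirmed by this document.

```python
# Suffixes copied from the table above; the keys are hypothetical labels.
ALLOWED_SUFFIXES = {
    "data": (".dat", ".gpr", ".srr", ".txt"),
    "grid": (".sag", ".gps", ".sra", ".shp"),
    "scan": (".tif",),
}
MAX_NAME_LENGTH = 50


def check_file_name(kind, name):
    """Return a list of problems with a file name of the given kind."""
    problems = []
    # Assumption: the limit counts every character of the name, including ".gz".
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"{name}: longer than {MAX_NAME_LENGTH} characters")
    # Compressed files keep their normal suffix with ".gz" on the end.
    lowered = name.lower()
    base = lowered[:-3] if lowered.endswith(".gz") else lowered
    if not base.endswith(ALLOWED_SUFFIXES[kind]):
        allowed = ", ".join(ALLOWED_SUFFIXES[kind])
        problems.append(f"{name}: expected one of {allowed}")
    return problems
```

For example, check_file_name("scan", "array01.scn") reports a suffix problem, which matches the note that .scn files cannot be used for automatic .gif generation.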


Data Files Required for Affymetrix Data entry

Please note that the database accepts only gene expression data, from Affymetrix and dChip software (see below). Affymetrix mapping, resequencing and universal array data cannot be entered.

Item            | Suffix     | Max # of characters | Notes
Data File       | .dat       | 50                  | The image file is generated by the Affymetrix chip scanning software. It is a 16-bit TIFF file that will be archived and converted to an 8-bit GIF file for viewing in the database.
Cell File       | .cel       | 50                  | This file is generated from the .dat file by Affymetrix MAS 5 or Affymetrix GeneChip Operating Software (GCOS). The native .cel file format for GCOS is a proprietary binary format. To upload GCOS .cel files into the database, open the GCOS Manager program and export the .cel file. This converts it into a text file the database can understand.
Gene File       | .txt, .xls | 50                  | This file is generated from the .cel file by Affymetrix MAS 5, Affymetrix GeneChip Operating Software (GCOS), or dChip. To upload the Probe Set file into the database, export it from the analysis software as a tab-delimited text file; see Uploading a Probe Set File.
Experiment File | .exp       | 50                  | This file is generated by the Affymetrix chip scanning software and contains chip protocol information.

Uploading a Probe Set File

  • Affymetrix GeneChip Operating Software (GCOS): Open the Probe Set intensity file (.CHP file) for a single chip. Select either the 'Pivot' tab or the 'Metrics' tab and save as a text file using the menu File -> Save as. If saving from the 'Pivot' tab, select all of the 'Statistical Absolute Result' columns from the menu Analysis -> Options -> Pivot tab. Enter 'Affymetrix MAS 5' for Probe Set Algorithm when loading the experiment.
  • Affymetrix MAS 5: Open the Probe Set intensity file (.CHP file) for a single chip. Select either the 'Pivot' tab or the 'Metrics' tab and save as a text file using the menu File -> Save as. Enter 'Affymetrix MAS 5' for Probe Set Algorithm when loading the experiment.
  • dChip: Open one or more chips. Select menu Tools -> Export Data. Then select a single chip, an export file name, and the absolute call and standard error columns to export the Probe Set intensity values. An example exported file is shown here. If you open it in Excel, remember to save it as tab-delimited text. Enter 'dChip MBEIs' for Probe Set Algorithm when loading the experiment.

Data Files Required for NimbleGen Data entry

Please note that the database currently accepts only single channel data from NimbleGen.

Item                | Suffix       | Max # of characters | Notes
Image File          | .tif         | 50                  | The image file should be a tiff file of the single channel scan. This file will be archived and a gif copy of it will be created for viewing in the database.
Cell File           | .xys         | 50                  | This file contains processed result data for each feature on the slide. It is a tab-delimited text file.
FTR File            | .ftr         | 50                  | This file is a tab-delimited text file that contains result information about features on the slides.
Gene Intensity File | .txt, .calls | 50                  | This file is a tab-delimited text file that contains result information for genes.

Data Files Required for Illumina Data entry

Please note that the database currently accepts only single channel data from BeadStudio.

Item              | Suffix       | Max # of characters | Required | Notes
Image File        | .tif         | 50                  | No       | The image file should be a tiff file of the single channel scan. This file will be archived and a gif copy of it will be created for viewing in the database.
Bead Chip File    | .idat        | 50                  | No       | This is a binary file written during scanning of the slide.
Control Data File | _control.txt | 50                  | No       | This file is a tab-delimited text file that contains result information about control probes. The file may contain data for more than one hybridization. A 'ProbeId' or 'Probe_Id' column is required.
Probe Data File   | _probe.txt   | 50                  | Yes      | This file is a tab-delimited text file that contains result information about sample probes. The file may contain data for more than one hybridization. A 'ProbeId' or 'Probe_Id' column is required.
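
The column requirement for the Illumina probe and control files can be checked before loading. The following is a minimal Python sketch; the function name has_probe_id_column is illustrative. It assumes that the first line of the file is the column header line, which may not hold if your export begins with other header information.

```python
import csv
import io


def has_probe_id_column(text_stream):
    """Return True if the first line of a tab-delimited stream names a probe ID column.

    Assumption: the column header is the first line of the file.
    """
    reader = csv.reader(text_stream, delimiter="\t")
    header = next(reader, [])
    return any(column.strip() in ("ProbeId", "Probe_Id") for column in header)


if __name__ == "__main__":
    # A tiny in-memory example; the second column name is made up.
    sample = io.StringIO("ProbeId\tSignal\n12345\t101.5\n")
    print(has_probe_id_column(sample))
```

To check a real file, open it in text mode with newline="" and pass the file object to the function.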

Moving Files via SFTP

Files can be transferred onto your loader account using SFTP only. For more information about SFTP, please see the GSSG page on Secure Remote Shells and File Transfers. SFTP clients for Mac and PC are available to Stanford affiliates. For detailed instructions on transferring files via SFTP, please see the SFTP Help Page. Please note that you will connect to the loader machine at your own university, which is not necessarily the university shown in the images on the help page. If your files are older than 2-3 weeks, disable the 'preserve timestamp' option of the SFTP client (you can do this in Fugu or WinSCP). Otherwise the files will be deleted at night by the process that cleans up loader files.
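
To find out which local files are old enough to be affected by the timestamp issue described above, you can list them before uploading. The following is a minimal Python sketch; the function name files_older_than is illustrative, and the default of 14 days is an assumption based on the "2-3 weeks" figure, not a documented cutoff of the clean-up process.

```python
import os
import time


def files_older_than(directory, days=14, now=None):
    """Return the sorted names of regular files in `directory` last modified
    more than `days` days ago.

    Assumption: 14 days approximates the lower end of "2-3 weeks".
    """
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    old = []
    for entry in os.scandir(directory):
        # Only regular files are considered; subdirectories are skipped.
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            old.append(entry.name)
    return sorted(old)
```

If the returned list is not empty, make sure the 'preserve timestamp' option of your SFTP client is disabled before you upload those files.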

Result Entry Options

On the Experiment and Result Entry form, you will need to select the following:

1. Feature Extraction Software Package: Only one feature extraction software package can be chosen.
2. Organism: Only one organism, associated with the array print, may be selected.
3. Loading method: The results from array analysis may be loaded either individually or by batch. For specific assistance in loading results using one of these methods, please consult the links below.
   1. Entering Results for a Single Array
   2. Entering Results for a Batch of Arrays

©2007-2010 Broad Institute and Stanford School of Medicine | Sponsored by The Bill & Melinda Gates Foundation