TB Database: an integrated platform for tb drug discovery

Help : Help Entering Results for a Batch of Arrays


Description

Often you will have so many microarray results to enter that typing the file names one by one on a web page would be too time-consuming. In that case, you can construct a file that provides all the information you would otherwise enter on the web page. The Data Entry for Microarray Experiment form is used to enter a batch of experiments into the database according to instructions specified in the batch file. This help page describes the information you need to enter an experiment and how to create the batch file. Small numbers of experiments can also be entered individually, and a separate help page is available for that procedure.


It is assumed you have already read First Time Users and What You Need in the previous help file, Entering Results from Experiments.

Assembling A Batch File

To load a group of experiments into the database, you first need to assemble a batch file in which each line contains all the information needed to enter one experiment. This is the same information that goes on the web form for individual experiment entry. The batch file must be tab-delimited, and its header row must contain the column headings below. These headings are the same as the entry box titles on the individual experiment entry form. The columns in the batch file can be in any order.

Sample batch files from which you can copy headings are available here for GenePix/ScanAlyze/SpotReader, Affymetrix, NimbleGen and Agilent data. From the File menu of your browser, choose "Save As Text", then copy or edit the resulting file. Your complete batch file must be in your loader account before you can enter the data.

Batch File Columns

As on the Individual Experiment Entry form, values for Experiment Date, Experiment Description, Collaborative Group, Individual User and IS_REVERSE are optional. You can either include these columns in your batch file and leave them blank, or omit them from the batch file entirely; either will work. The IS_REVERSE value defaults to 'N'. As in individual entry, Slide Name and Experiment Name must be unique. On the individual form, Print Name, Experiment Category and SubCategory, Normalization Type, and the various access selections are chosen from menus. In a batch file, by contrast, you must know the exact values in advance, so consult the Print, Category, Subcategory, user group and user lists while putting together your batch file.
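As a rough illustration of the layout described above, the following sketch builds a tab-delimited batch file with a header row and one line per experiment. The column headings and the "Data File" column used here are assumptions for illustration only; copy the real headings from the sample batch file for your data type. Optional columns that you do not fill in are simply left blank.

```python
import csv
import io

# Hypothetical headings: replace with the headings from a sample batch file.
COLUMNS = [
    "Experiment Name",
    "Slide Name",
    "Print Name",
    "Experiment Category",
    "SubCategory",
    "Normalization Type",
    "Norm Value",
    "Data File",
]


def batch_text(rows):
    """Return the batch file contents for a list of per-experiment dicts."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=COLUMNS,
        delimiter="\t",        # the batch file must be tab-delimited
        lineterminator="\n",   # UNIX line endings for the loader account
        restval="",            # columns missing from a row are left blank
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    example = [
        {
            "Experiment Name": "exp1",
            "Slide Name": "slide1",
            "Normalization Type": "Computed",
            "Data File": "1234.gpr",
        }
    ]
    print(batch_text(example), end="")
```

Because the columns can be in any order, the order of `COLUMNS` is a matter of convenience.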

Normalization Type is required and can be "Computed", "Regression" or "User Defined". If Normalization Type is "User Defined", then Norm Value is required; any number can be entered as a Norm Value. If Normalization Type is "Computed", the default computed normalization is used. If Normalization Type is either "Computed" or "Regression", the Norm Value column should be left blank. For Agilent and Affymetrix data, Normalization Type must still be supplied, but its value is ignored.
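The normalization rules above can be checked before you upload a batch file. The sketch below is one way to express them; the function name and the wording of the messages are illustrative and are not part of the loader.

```python
ALLOWED_TYPES = {"Computed", "Regression", "User Defined"}


def check_normalization(norm_type, norm_value):
    """Return a list of problems for one row's normalization columns."""
    problems = []
    value = (norm_value or "").strip()
    if norm_type not in ALLOWED_TYPES:
        problems.append(f"unknown Normalization Type: {norm_type!r}")
    elif norm_type == "User Defined":
        # A Norm Value is required, and it must be a number.
        if not value:
            problems.append("Norm Value is required for 'User Defined'")
        else:
            try:
                float(value)
            except ValueError:
                problems.append(f"Norm Value is not a number: {value!r}")
    elif value:
        # "Computed" and "Regression" expect a blank Norm Value.
        problems.append(f"Norm Value should be blank for {norm_type!r}")
    return problems
```

An empty list means the pair of values follows the rules described above.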

Experiment Date will default to the date the experiment is entered if the column is left blank. Two date formats are accepted. One is a 4-digit year, 2-digit month, and 2-digit day (YYYY-MM-DD, e.g. 2000-01-18), and the other is the Excel default (MM/DD/YY, e.g. 01/18/00).
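If you generate batch files from a script, a small helper such as the following can confirm that each date is in one of the two accepted formats. This is a sketch only: it uses Python's own rule for two-digit years (00 to 68 become 2000 to 2068), which is an assumption and may not match how the loader interprets them.

```python
from datetime import date, datetime

ACCEPTED_FORMATS = ("%Y-%m-%d", "%m/%d/%y")  # YYYY-MM-DD and MM/DD/YY


def parse_experiment_date(text, today=None):
    """Parse an Experiment Date value; a blank value defaults to today."""
    text = text.strip()
    if not text:
        return today or date.today()
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # try the next accepted format
    raise ValueError(f"unrecognized Experiment Date: {text!r}")
```

For the two examples given above, `parse_experiment_date("2000-01-18")` and `parse_experiment_date("01/18/00")` yield the same date.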

Data files (.sag, .dat, .tif, etc.) can be identified in the batch file by file name alone only if the batch file is in the same directory as all the data files. If some data files are organized in subdirectories inside incoming/, then the batch file should include the path to those data files relative to the batch file. For example, if some of the data files are in the "worm_aging" directory inside the incoming/ directory, the path would be: "worm_aging/1234.gpr".
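The relative-path rule can be sketched as follows, assuming the batch file itself sits in incoming/. The helper names and the batch file name `batch.txt` are hypothetical; the second function only makes sense when run on the machine that holds the files.

```python
from pathlib import Path, PurePosixPath


def resolve_data_file(batch_file, entry):
    """Join a data-file entry to the directory containing the batch file."""
    return PurePosixPath(batch_file).parent / entry


def missing_data_files(batch_file, entries):
    """Return the entries that do not point at an existing file."""
    return [
        entry
        for entry in entries
        if not Path(str(resolve_data_file(batch_file, entry))).is_file()
    ]


if __name__ == "__main__":
    print(resolve_data_file("incoming/batch.txt", "worm_aging/1234.gpr"))
```

Run as a script, this prints the location a "worm_aging/1234.gpr" entry refers to when the batch file is incoming/batch.txt.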

Running the Batch Load Program

  1. Initiating the Process: Clicking the "Load Experiment(s) using a Batch File" button on the Experiment and Result Entry form takes you to the batch entry interface, where you can specify the feature extraction software you used and enter the name of your batch file.

  2. Verify the Batch File: You should check your batch file before loading. To do this, after initiating the process, enter your batch file location (UNIX file path) and click "Check Batch File". This runs a check of your batch file and reports the results in your browser. You can then correct any errors in your batch file (re-uploading it if you edit it on your PC) and re-submit it. Typical errors include malformed files (the file must be tab-delimited text), incorrect names (print names, categories, subcategories, and experimenter must exist in the database), incorrect data-file paths, and repeated values in columns that must be unique (slide names, experiment names).

  3. Submit your Request: All that remains is to click the 'Load Experiments' button. The loading program scans your batch file one last time for incorrect entries, and then enters the request into a queue.
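Some of the typical errors listed in step 2 can be caught on your own machine before you upload. The sketch below is not the "Check Batch File" program; it only looks for rows with the wrong number of tab-separated fields and for repeated values in the columns that must be unique. It cannot tell whether print names, categories or users exist in the database. The heading text "Slide Name" and "Experiment Name" is assumed to match your batch file.

```python
import csv
import io
from collections import Counter

UNIQUE_COLUMNS = ("Slide Name", "Experiment Name")  # assumed heading text


def precheck(text):
    """Return a list of problems found in tab-delimited batch file text."""
    # QUOTE_NONE: treat quote characters as ordinary text, split on tabs only.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    if not rows:
        return ["batch file is empty"]
    header, data = rows[0], rows[1:]
    problems = []
    for line_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}"
            )
    for column in UNIQUE_COLUMNS:
        if column not in header:
            continue
        index = header.index(column)
        counts = Counter(row[index] for row in data if len(row) > index)
        for value, count in counts.items():
            if count > 1:
                problems.append(f"{column} '{value}' appears {count} times")
    return problems
```

An empty list means none of these particular problems was found; the batch file should still be verified with "Check Batch File" as described in step 2.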

Monitoring Your Request as it Progresses Within the Queue

Loading begins when your request is placed in a queue. The rate of loading is determined by a number of factors, including the load on the database and how many other array-load requests were made before yours. If there are no delays, loading usually takes at least five minutes per array, and may take considerably longer if your arrays have a large number of spots (human arrays) or if many other users are using the database. During this time, you can check the progress of your experiment load within the queue. After your request is successfully entered into the queue (note: this is not the same thing as final entry into the database), you should receive a confirmation screen as well as an email notifying you:

	Your database entry request (batch number XXXX) has been
queued for loading.

	Please note the data for your array(s) ARE NOT YET IN THE
DATABASE.  Do NOT delete any of your files until you receive email
confirmation that the data have been loaded.

	Progress of this batch within the queue can be viewed at:

http://genome-www5.stanford.edu/cgi-bin/tools/queue/nph-ProgressQuery?batchno=XXXX

	You may also view this file in your loader account under the ORA-OUT directory.
	If you have any questions please contact the curators 

	(array@genome.stanford.edu).

You can check the progress of your experiment load using the batch number reported to you, either through the link on the queue confirmation page or through the URL in the email.


Successful Result Entry into the Database

If all goes well, you will eventually get an email message that says:


        Loading of your array data (batch number XXXX), has
successfully completed. 1 out of 1 were successfully loaded.

	Details of the load process have been written to :
        
	/loader/ftphome/username/ORA-OUT/XXXX.log,

or you can temporarily view the details via the web at:

http://genome-www5.stanford.edu/cgi-bin/tools/queue/nph-ProgressQuery?batchno=XXXX

	If you have any questions please contact the curators

	(array@genome.stanford.edu).

The following message should appear at the bottom of the HTML confirmation page, or in the log file in your ORA-OUT directory on loader.stanford.edu:
***All data for this experiment ('slidename') have been successfully inserted into oracle database***

Common Problems

  1. If your results have not been loaded one day after entry into the queue, please notify the microarray database curators.

  2. File location: All files must be in the incoming directory on your account for loader.stanford.edu.

  3. UNIX file names: The names of your uploaded files should not contain spaces, or any of the following characters:
    '  "  #  ,  /  \  ?  <  >  ;  :  !  @  %  ^  &  *  (  )

  4. Occasionally, we back up and re-index the database. This process can significantly delay the loading of data (and vice versa). We suggest not loading during these time periods. Consult the Scheduled Database Backups page for the times to avoid.

  5. Sometimes .tif -> .gif conversion fails. Please check your loaded arrays by displaying them and verifying the clickable gif. If you need to replace the gif file that we have created for you, please see our help documentation for this. If there is no clickable-gif icon present, contact the microarray database curators.

  6. Errors? What errors? Shortly after a queue batch request is processed (successfully or not), you will no longer be able to monitor its status within the queue, because the request has been removed and its web log with it. However, you can still check your ORA-OUT directory on loader.stanford.edu to see the text log file of the database entry.
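The file-name rule in item 3 above is easy to check before uploading. The following sketch tests a single file name (not a path, since "/" is itself on the list) against the space character and the characters listed in that item; the function name is illustrative.

```python
# Space plus the characters listed in item 3.
FORBIDDEN = set(" '\"#,/\\?<>;:!@%^&*()")


def bad_characters(filename):
    """Return the forbidden characters found in a file name, sorted."""
    return sorted(set(filename) & FORBIDDEN)


if __name__ == "__main__":
    for name in ("1234.gpr", "my file(1).gpr"):
        found = bad_characters(name)
        print(name, "ok" if not found else f"contains {found}")
```

A name such as "1234.gpr" passes, while "my file(1).gpr" is flagged for the space and the parentheses.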


Please send comments or questions to: array@genome.stanford.edu