TB Database: an integrated platform for TB drug discovery

Help : Platesample Entry Help


Description

The creation of a print within the database is a complicated process, but it is absolutely required prior to experiment entry. To get your data into the database, we need a number of things, the most important being the godlist. A godlist is a list of plate samples (well address + contents), in the order the plates were put in the printer. It is a tab-delimited text file, most likely exported from a spreadsheet application. The requirements for this file are described below.

Before you start...

Not every new print requires a godlist submission. A new godlist should only be submitted if you are certain that
  1. The plates used during printing have not been previously entered into the database, or
  2. The plate(s) were entered in the past, but their contents have changed over time (well contamination, well emptied), and are therefore considered novel.
For example, if your lab makes 3 different prints using the exact same plates (perhaps in different orders), it is not necessary to compile separate godlists for all three. To enter the 2 subsequent prints, the curators require only a 2-column platelist, composed of database plateIDs and plate names from the first print (in their new order). The curators can assist in setting up your "master" plate list, from which all future prints will likely be composed. New plates can be entered as needed, as described below.
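As an illustration only, a 2-column platelist for a re-ordered print could be assembled as sketched below. The tab-delimited layout, the function name `build_platelist`, and the plate names and plateIDs are all assumptions made for the example; ask the curators for the exact format they expect.

```python
def build_platelist(master, new_order):
    """Return platelist text: one 'plateID<TAB>plate name' line per plate.

    master    : dict mapping plate name -> database plateID (hypothetical values)
    new_order : plate names in the order they were put in the printer
    """
    missing = [name for name in new_order if name not in master]
    if missing:
        # A plate absent from the master list has not been entered yet
        # and would need a godlist submission instead.
        raise KeyError(f"plates not in master list: {missing}")
    return "".join(f"{master[name]}\t{name}\n" for name in new_order)


# Made-up master plate list from a first print.
master = {"TB_plate_A": 101, "TB_plate_B": 102}
platelist_text = build_platelist(master, ["TB_plate_B", "TB_plate_A"])
```

A plate that is missing from the master list raises an error rather than being silently skipped, since such a plate has to be entered as a new plate first.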

Column Headers

Required Columns
PLAT: The plate number; e.g. 1, 2, 3, etc.
PROW: The plate row; e.g. A, B, C, etc.
PCOL: The plate column; e.g. 1, 2, 3, etc.
NAME: The sequence name; usually a systematic name or clone identifier (see CLONEID, below). This is the only name used for samples of TYPE other than CDNA.
TYPE: The sequence type; usually ORF, CDNA, CONTROL, or EMPTY. List of TYPEs | Download SUID/TYPE Examples
FAIL: Whether the PCR (sample verification) failed:
  0 : one distinct band - success
  1 : no signal - fail
  2 : multiple distinct bands
  3 : signal, but not a distinct band (smear)
  4 : multiple smears
  5 : unknown
  101 : worst cases of peeled away or haloed spots (assigned on a 96 well plate basis)
  102 : less bad cases of peeled away or haloed spots (assigned on a 96 well plate basis)
  Null is assumed to be 0 (success).
CLONEID: Required for samples of TYPE=CDNA, if ACC is absent/null. Real cDNA clones must have a cloneID; otherwise the sample is assumed to be a pseudoclone, which requires a surrogate accession. Format examples: IMAGE:34049, ATCC:183963
ACC: Required for samples of TYPE=CDNA, if CLONEID is absent/null. This is the GenBank accession, usually acquired from dbEST. Used to populate the clone and clone_gbacc tables.
Optional Columns
DESC: A description of the molecular entity, if desired. This description is associated with the SUID itself (not a clone or platesample description).
LUID: Laboratory Unique ID. For those samples that have identical NAME and TYPE, but require distinction within the laboratory for experimental reasons (different sources, questionable quality, sample tracking).
GENE_NAME: Sometimes clones will stop being included in UniGene for spurious reasons, but users have a 'Preferred Name' for those clones. The GENE_NAME column will be entered into the preferred_name column of the clone table, for a new clone SUID.
ORIGIN: For CDNA clones, this can indicate whether they are public or private.
SOURCE: A string describing the source of the clone or DNA. This has typically been used to indicate the original plate source, and the 96 and 384 well plate locations that a clone has been in, e.g. GF200:96(1A1):384(1A1). In this case GF200 refers to a set of resgen plates, aka the 1st 5K. This field can be used by any type of DNA.
IS_CONT: Whether the sample is known to be contaminated. A blank entry will default to unknown (U).
IS_VER: Whether the DNA in a well has been verified. A blank entry will default to no (N).
SAMPLE_DESC: A description, if any, about that particular sample. This description is specific to the plate sample.
ORGANISM: Used when submitting a print containing samples from multiple organisms (e.g. human, yeast). For those few rows where the sample is derived from an organism *other* than the default (user-defined), the organism code must be specified. For a list of 2-letter organism codes, go here

RULE 1: Required column headers are: PLAT, PCOL, PROW, NAME, TYPE, FAIL, and CLONEID, ACC (if TYPE=CDNA). If any of these headers is misspelled or absent, or if any data is null (except in the FAIL/ACC/CLONEID columns), you cannot proceed with godlist submission. In addition, PLAT, PCOL, and PROW ordering must be correct and no wells may be skipped. Empty wells must be specified as such, except for the tail-end of the last plate in the print run (also see common errors). Optional columns: DESC, LUID, CLID, GENE_NAME, ORIGIN, SOURCE, IS_CONT, IS_VER, SAMPLE_DESC, CLONE_DESC, ORGANISM.
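Before submitting, you may find it useful to pre-check your file against the header and null requirements of RULE 1 yourself. The following is a minimal sketch, not the submission program itself: the function name `check_godlist` and the sample rows are made up for illustration, and the well-ordering requirement is not checked because it depends on your plate format.

```python
import csv
import io

ALWAYS_REQUIRED = ["PLAT", "PROW", "PCOL", "NAME", "TYPE", "FAIL"]
NON_NULL = ["PLAT", "PROW", "PCOL", "NAME", "TYPE"]  # FAIL may be blank (= 0)
FAIL_CODES = {"", "0", "1", "2", "3", "4", "5", "101", "102"}


def check_godlist(text):
    """Return a list of problems found in tab-delimited godlist text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    headers = reader.fieldnames or []
    problems = [f"missing header: {h}" for h in ALWAYS_REQUIRED if h not in headers]
    if problems:
        return problems  # row checks are meaningless without the headers
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in NON_NULL:
            if not (row.get(col) or "").strip():
                problems.append(f"line {line_no}: {col} is empty")
        if (row.get("FAIL") or "").strip() not in FAIL_CODES:
            problems.append(f"line {line_no}: unknown FAIL code")
        if (row.get("TYPE") or "").strip() == "CDNA":
            # CDNA samples need a CLONEID or, failing that, an ACC.
            if not any((row.get(c) or "").strip() for c in ("CLONEID", "ACC")):
                problems.append(f"line {line_no}: CDNA needs CLONEID or ACC")
    return problems


# Made-up two-row godlist: one ORF sample and one empty well.
header = "PLAT\tPROW\tPCOL\tNAME\tTYPE\tFAIL\tCLONEID\tACC\n"
good = header + "1\tA\t1\tRv0001\tORF\t0\t\t\n1\tA\t2\tEMPTY\tEMPTY\t\t\t\n"
bad = header + "1\tA\t1\tclone1\tCDNA\t9\t\t\n"
```

Here `check_godlist(good)` reports no problems, while `check_godlist(bad)` reports both the unknown FAIL code and the missing CLONEID/ACC for the CDNA row.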

Names

SMD uses the combination of NAME, TYPE, and ORGANISM to uniquely identify a sample. Each unique combination is given a numeric Stanford Unique ID, also called SUID. SUIDs allow comparison of the same samples across different prints. Thus, it is extremely important to ensure that erroneous SUIDs are not created. Erroneous SUIDs are usually created by a bad NAME (either misspelled, non-standard, or non-systematic). Every new sample must be verified and committed to the database (via SUID) before the godlist/print can be entered. Therefore, all rows in your godlist file are checked to see if the combination of NAME, TYPE, and ORGANISM has been used previously. If not, these samples (rows) must have new SUIDs assigned to them, provided the user verifies that they are legitimate, new samples.

RULE 2: If any samples within your godlist are not currently in the database, you will be prompted to double-check your entries prior to passing the intermediate file off to the curators. Please be a conscientious user and verify that any new SUIDs you approve are valid. Erroneous SUIDs prevent comparisons between prints/experiments!
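The check behind RULE 2 can be pictured as a lookup of each row's (NAME, TYPE, ORGANISM) combination against the combinations already known to the database. The sketch below is an assumption about how such a lookup might look, not the database's actual code: the function name, the in-memory `known_keys` set, the exact-match comparison, and the organism codes "MT" and "SC" are all hypothetical.

```python
def new_sample_keys(rows, known_keys, default_organism):
    """Return (NAME, TYPE, ORGANISM) combinations absent from known_keys.

    rows             : godlist rows as dicts keyed by column header
    known_keys       : set of combinations already in the database (assumed)
    default_organism : code used when a row leaves ORGANISM blank
    """
    seen = set()
    new = []
    for row in rows:
        organism = (row.get("ORGANISM") or "").strip() or default_organism
        key = (row["NAME"].strip(), row["TYPE"].strip(), organism)
        if key not in known_keys and key not in seen:
            seen.add(key)  # report each new combination only once
            new.append(key)
    return new


# Made-up rows; "MT" and "SC" are hypothetical organism codes.
rows = [
    {"NAME": "Rv0001", "TYPE": "ORF"},
    {"NAME": "Rv0002", "TYPE": "ORF", "ORGANISM": ""},
    {"NAME": "Rv0002", "TYPE": "ORF"},
    {"NAME": "YAL001C", "TYPE": "ORF", "ORGANISM": "SC"},
]
known = {("Rv0001", "ORF", "MT")}
to_verify = new_sample_keys(rows, known, "MT")
```

Every combination in `to_verify` would receive a new SUID if approved, so a misspelled NAME shows up here as an unexpected "new" sample. That is the point at which to catch it.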

Making a print - the relationship between the godlist and the final print

If your godlist passes the first two rules (above), you must now specify how the samples were printed so that the SPOTLIST can be de-convoluted. To do this, you must know how many sectors (corresponding to printer tips), columns, and rows your print output, and therefore your gridding methodology, corresponds to. You will have to provide these 3 values, and their product must be at least the number of rows in your godlist (sans header).

RULE 3: This equation must be followed:
#samples = (#godlist rows-1) <= #tips * #rows * #columns = #spots
Therefore, you are not permitted to have more godlist samples than gridded spots (experiment loading would be disallowed). Conversely, if your #spots (gridded) is greater than #samples (godlist rows) realize that the data from the empty, "ghost" spots on the partial last row will be disposed of during experiment loading (allowed).
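The arithmetic of RULE 3 can be checked in a few lines. This is only a sketch of the inequality stated above; the function name `check_rule_3` and the example print geometry (16 tips, 20 rows, 20 columns) are made up for illustration.

```python
def check_rule_3(godlist_line_count, tips, rows, columns):
    """Return the number of empty "ghost" spots, or raise if RULE 3 is violated.

    godlist_line_count : total lines in the godlist file, header included
    tips, rows, columns: sectors (printer tips) and the rows/columns per sector
    """
    samples = godlist_line_count - 1  # the header line is not a sample
    spots = tips * rows * columns
    if samples > spots:
        raise ValueError(f"{samples} samples exceed {spots} gridded spots")
    return spots - samples  # ghost spots are discarded at experiment loading


# 6144 samples (plus header) on a hypothetical 16-tip, 20 x 20 print.
ghost_spots = check_rule_3(6145, tips=16, rows=20, columns=20)
```

In this example the print has 16 * 20 * 20 = 6400 spots, so the 6144 samples fit and 256 ghost spots remain.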

Common Errors to Avoid

There is a program to assist you in godlist submission, which follows the rules stipulated above. In order to run it, you must place (or FTP) the godlist in your ORA-OUT directory within your home directory. This is so the program can read your original file, offer feedback, and write intermediate steps.


Please send comments or questions to: array@genome.stanford.edu