I recommend you read through this section, then scan the commands in ALSCRIPT Command Summary to get a feel for what ALSCRIPT can do.
ALSCRIPT can be invoked in several ways (see Alternative ways of invoking ALSCRIPT). This section describes the interactive method. The QUICK START method shown in Alternative ways of invoking ALSCRIPT is useful for formatting a sequence alignment quickly using the standard pointsize and shading.
ALSCRIPT is designed to work with AMPS block file format multiple alignments. If you have a multiple alignment generated by CLUSTAL V or the GCG package, then it must be translated to AMPS block file format.
To translate a GCG .MSF file, type:
msf2blc
To translate a CLUSTAL PIR format file, or any PIR format file, type:
clus2blc
Both programs prompt for the name of an input file, and an output block file name. A good convention to follow is to name all blockfiles with the extension ".blc".
To run ALSCRIPT simply type:
alscript
You will then be prompted for the name of the ALSCRIPT command file. Once you have typed the filename, the commands in that file are executed as you have specified.
A Simple Command File (example.als)
The file example.blc contains a small multiple sequence alignment. The following ALSCRIPT command file will convert this into a PostScript alignment file in 12 point Helvetica.
#Comments in ALSCRIPT command files start with a #
#
#Commands are free format - separated by blank, tab or comma characters
#
BLOCK_FILE example.blc           #define the block file to format
OUTPUT_FILE example.ps           #where to put the result
LANDSCAPE                        #landscape paper orientation
POINTSIZE 12                     #12 point default pointsize
DEFINE_FONT 0 Helvetica DEFAULT  #set font 0 to be Helvetica
SETUP                            #Tell the program to get on with it.
Now try changing the POINTSIZE value to 5. ALSCRIPT will re-format the alignment to make best use of the available paper.
The commands in example.als are all STEP 1 commands: they control the overall layout and system settings, such as the paper size or maximum sequence length. Other commonly used STEP 1 commands are IDENT_WIDTH, which reserves more or less width for the sequence identifier codes; NUMBER_SEQS, which adds a number to each sequence; and LINE_WIDTH_FACTOR, which allows the thickness of all boxing lines to be adjusted. See ALSCRIPT Command Summary for details of these and all other STEP 1 commands.
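The free-format rule stated in the comments of example.als (tokens separated by blank, tab or comma characters, with # starting a comment) can be illustrated with a short Python sketch. This is not ALSCRIPT's own parser: the function names tokenize_line and parse_command_text are hypothetical, and the sketch assumes one command per line and that # always begins a comment.

```python
import re


def tokenize_line(line):
    """Split one command-file line into tokens, dropping any # comment."""
    # Assumption: everything from the first '#' to the end of the line is a comment.
    code = line.split("#", 1)[0]
    # Tokens are separated by runs of blanks, tabs or commas.
    return [tok for tok in re.split(r"[ \t,]+", code.strip()) if tok]


def parse_command_text(text):
    """Return (command, arguments) pairs, one per non-empty line.

    Assumption: each line holds at most one command.
    """
    commands = []
    for line in text.splitlines():
        tokens = tokenize_line(line)
        if tokens:
            commands.append((tokens[0], tokens[1:]))
    return commands


# A shortened command file using blank, comma and tab separators.
EXAMPLE = """\
#Comments in ALSCRIPT command files start with a #
BLOCK_FILE example.blc   #define the block file to format
POINTSIZE,12
DEFINE_FONT 0\tHelvetica DEFAULT
SETUP
"""

if __name__ == "__main__":
    for command, args in parse_command_text(EXAMPLE):
        print(command, args)
```

Under these assumptions, the three separator styles in EXAMPLE all yield the same kind of token list, and comment-only lines produce no command.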
The simple example outlined above can be modified with a variety of STEP 2 commands, for example in the file example2.als:
# FILE example2.als
#
#Commands are free format - separated by blank, tab or comma characters
#
BLOCK_FILE example.blc                #define the block file to format
OUTPUT_FILE example2.ps               #where to put the result
LANDSCAPE                             #landscape paper orientation
POINTSIZE 12                          #12 point default pointsize
DEFINE_FONT 0 Helvetica DEFAULT       #set font 0 to be Helvetica
DEFINE_FONT 1 Helvetica REL 0.5       #set font 1 to be half sized Helvetica
DEFINE_FONT 3 Helvetica-Bold DEFAULT  #set font 3 to be Bold Helvetica
DEFINE_FONT 4 Times-BoldItalic DEFAULT #font 4 is Times-BoldItalic
NUMBER_SEQS                           #Number the sequences at left hand side
SETUP                                 #Tell the program to get on with it.
#
#step 2 commands come after the SETUP command
#
#Here are some examples...
#
SURROUND_CHARS GP ALL                 #draw lines around all G and P
SHADE_CHARS ILVW ALL 0.6              #shade all I L V and W with value 0.6
BOX_REGION 1 1 2 20 0.8               #rectangular box from positions 1 to 2 of sequences 1 to 20
FONT_CHARS C ALL 3                    #Use font 3 (BOLD Helvetica) for C characters
ID_FONT ALL 1                         #set identifiers in font 1
These commands, and the others shown in ALSCRIPT Command Summary, can be combined in many ways. In general, if you apply multiple commands to the same residue, the last command applied takes effect wherever there would otherwise be a conflict. Thus the intersection of two overlapping shaded regions is shaded according to the second SHADE command, not some mixture of the two. The same holds for FONT commands. BOX and SURROUND commands behave in the opposite sense: all boxing and surrounding persists regardless of how many commands you issue. This makes it possible, for example, to SURROUND two different sets of residues as follows:
SURROUND_CHARS DE ALL
SURROUND_CHARS DEHKR ALL
This would result in lines partitioning the D and E characters from the rest, as well as lines partitioning the D, E, H, K and R characters from the rest (see Example output).
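The conflict rule described above (the last SHADE or FONT command wins, while BOX and SURROUND commands accumulate) can be modelled with a small Python sketch. It is a hypothetical illustration of the rule only, not ALSCRIPT's implementation: the Cell class and the shade_chars and surround_chars helpers are invented here, and a surround is simply recorded as the character set that requested it rather than drawn as lines.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cell:
    """Formatting state of one residue (hypothetical model)."""
    residue: str
    shade: Optional[float] = None                       # one value: later commands overwrite it
    surrounds: List[str] = field(default_factory=list)  # every surround request is kept


def shade_chars(cells, chars, value):
    """Model of SHADE_CHARS: the most recent shade value replaces any earlier one."""
    for cell in cells:
        if cell.residue in chars:
            cell.shade = value


def surround_chars(cells, chars):
    """Model of SURROUND_CHARS: each request is added to those already present."""
    for cell in cells:
        if cell.residue in chars:
            cell.surrounds.append(chars)


# One short row of residues, formatted by two overlapping commands of each kind.
row = [Cell(r) for r in "ADKEG"]
shade_chars(row, "DE", 0.6)
shade_chars(row, "DEHKR", 0.9)   # overrides 0.6 on D and E
surround_chars(row, "DE")
surround_chars(row, "DEHKR")     # added alongside the first surround

if __name__ == "__main__":
    for cell in row:
        print(cell.residue, cell.shade, cell.surrounds)
```

In this model, D and E end up with the second shade value only but keep both surround requests, whereas K carries only the second surround and A and G are untouched.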