O for Morons

Tutorial version 1.1

O for Morons - a Beginner's Guide

Written by Gerard J. Kleywegt

Department of Molecular Biology

University of Uppsala

Uppsala, Sweden

E-mail: "gerard@xray.bmc.uu.se"

(c) 1994 - G.J. Kleywegt

Version 0.1 @ 931222 - lay-out of the document

Version 0.2 @ 931223 - chapters 1 and 2; part of 10.3

Version 0.3 @ 931227 - chapters 3, 4, 5 and 6

Version 0.4 @ 931228 - chapters 7 and 8; more macros

Version 0.5 @ 931229 - first, almost complete version

Version 0.6 @ 931230 - appendix 10.2 and 10.6; put into MacWriteII

Version 0.7 @ 931231 - checked spelling; cleaned up

Version 0.8 @ 940102 - "beta-test version"

Version 0.9 @ 940114 - changes, corrections etc.

Version 1.0 @ 940115 - this-will-have-to-do version

Version 1.1 @ 940211 - changes after Protein Engineering course

0.0 - Contents

Section Contents

0.0 Contents

0.1 Preamble

0.2 Notes for instructors

1.0 Getting started

1.1 Let's gO !

1.3 Well, well, what's all this then ?

1.4 My first protein structure !

1.5 Connecting the dots

1.6 Shake, rattle 'n' roll

1.7 Question time !

2.0 Properties, structure and paint

2.1 More detail

2.2 Structure

2.3 Painting

2.4 It's the Spanish Inquisition !

3.0 What's on the menu today ?

3.1 What's a menu ?

3.2 Fast food

3.3 A la carte

3.4 The customiser is king

3.5 Connecting the dots differently

3.6 What's up, Doc ?

4.0 Again and again and again ...

4.1 Recipes

4.2 What's the ID ?

4.3 What if ?

4.4 Symbolically speaking

4.5 Unix speaking

4.6 Do-it-yourself !

4.7 Inter-course fun

4.8 Tell me why

5.0 What a super position !

5.1 All lipocalins are equal ...

5.2 Match-makers

5.3 Operator, what's the number ?

5.5 How did you do ?

6.0 Superficial voids

6.1 Speleology

6.2 Surfaces

6.3 More superficiality

6.4 What now ?

7.0 Get out the yard-stick !

7.1 What's your angle ?

7.2 Neighbours

7.3 Interactive

7.4 The main chain stays mainly in the plane

7.5 Making flippy floppy

7.6 Walk on the wild side chain

8.0 Teenage, mutant, Ninja proteins

8.1 I think I'm having a fit !

8.2 More mutations

8.3 Cleaning up

8.4 Insertions

8.5 Fine-tuning

8.6 Reconstructing a structure

8.7 One more question, Your Honour

9.0 Fancy pictures

9.1 Sketch it !

9.2 On the table

9.3 Selections

9.5 Nice pictures

10.0 Appendices

10.1 Index of O commands discussed in this tutorial

10.2 Inverted index

10.3.1 Why does mutate_insert/replace distort my molecule ?

10.3.2 Is there an easy way to select fancy colours in O ?

10.3.3 Why do the dials move so fast on my SGI ?

10.3.4 What is the best way to backup a molecule ?

10.3.5 Is there a split-screen stereo, and if so, how do I access it ?

10.3.6 When I try to display a map, I get an error condition #43.

10.3.7 How do I reset the parameters for major menu X ?

10.3.8 When I display my CA trace, some of the bonds are missing.

10.3.9 Can I display an RNA backbone ?

10.3.10 How can I centre on a particular spot in space ?

10.3.11 How do I mutate_insert after the last residue ?

10.3.12 How do I mutate_insert before the first residue ?

10.3.13 Why does RSC give crazy values for some residue types ?

10.3.14 Why does O ignore some of the commands in my macro ?

10.3.15 Why doesn't O draw all bonds in my ligand ?

10.3.16 Does O work with nucleic acids ?

10.3.17 How can I give my waters/ligands a different chain-id ?

10.3.18 How can I get H-bonds between protein and ligand ?

10.3.19 Can I get a LOG file from O ?

10.3.20 What does this INST error mean ?

10.3.21 Can I ID the atom at the active centre from a macro ?

10.3.22 How should I contour cavities and surfaces ?

10.3.23 How can I connect the two S-gamma atoms in a disulfide ?

10.3.24 Why doesn't sphere_atom work in my ODL file ?

10.4 Macros

10.4.1 date.omac

10.4.2 colour_code.omac

10.4.3 edb.omac

10.4.4 all_on_off.omac

10.4.6 acid_base.omac

10.4.7 cnos_colours.omac

10.4.8 ball_and_stick.omac

10.4.9 set_prefs.omac

10.4.10 paint_restype.omac

10.4.11 save_view.omac

10.4.12 yasspa.omac

10.4.13 rainbow.omac

10.4.14 nice_residue_colours.omac

10.4.15 sketch_setup.omac

10.5 Other O commands

10.6 Selected datablocks

10.6.1 .message_template

10.6.2 .id_template

10.6.3 .molec_obj_integer

10.6.4 .molec_obj_real

10.6.5 .o-version

10.6.6 .active_centre

10.6.8 .moving_atom

10.6.9 .colour_names

10.6.10 .error_messages

10.6.11 .dial_real

10.6.12 .timestamp

10.6.13 .symbols

10.6.15 .gs_real

10.6.16 .active_colour

10.6.18 .trig_real

10.6.19 .torsion_information

10.6.20 file_o_save

10.6.21 file_o_backup

10.6.22 .solid_hbonds

10.6.23 .lsq_integer

10.6.24 .refi_dict

0.1 - Preamble

This document was written for use in the "Computers, Graphics & Databases in Molecular Biology" module of the Protein Engineering course of Winter 1994. This document assumes use of O release 5.9.2 on an SGI/Unix workstation.

O is a macromolecular crystallographic modelling program. It can be used to look at biomacromolecular structures, to analyse them, to compare them, to modify them and to build them from scratch (using crystallographic data). The program is described in:

* T.A. Jones, J.Y. Zou, S.W. Cowan & M. Kjeldgaard, "Improved Methods for the Building of Protein Models in Electron Density Maps and the Location of Errors in these Models", Acta Crystallographica, A47 (1991) 110-119

* T.A. Jones & M. Kjeldgaard, "O -- the manual", 161 pp., Uppsala (1993)

Some of the algorithms implemented in O are described in earlier papers, referenced in both documents. Others are described in these documents themselves.

This tutorial is no substitute for the O Manual; they are complementary documents. The O Manual explains every command in turn; this tutorial (hopefully) guides you through a subset of the O commands in a more or less logical order. Also, the tutorial will not teach you how to use O in a biomacromolecular crystallography context, i.e. you will not learn how to build a protein structure from an MIR map (we may write a tutorial for that later).

What you WILL learn is how to draw, analyse and compare protein structures. Extension to other types of molecule (RNA, DNA, ...) is not entirely trivial, but should not be impossible once you master the basics of O.

The tutorial contains nine instructional chapters and an appendix. Working through chapter one should take about an hour; chapters 2, 3, 4, 6 and 9 take ~2 to 3 hours and chapters 5, 7 and 8 take ~3 to 4 hours. Questions which test the subject matter to which you have just been introduced are included both throughout and at the end of each chapter. Try to answer these (there is not always a single correct answer) and write down your answers, so your assistant can check them.

0.2 - Notes for instructors

In the Uppsala Protein Engineering course, the first three days are reserved for an introduction to computers (Unix), graphics (O) and databases (PDB, nucleotide sequences) in Molecular Biology. In the 1994 course, we have room for 32 students using 8 SGI Indys. The students are divided into 16 pairs, 8 of which are working in the graphics lab at any given time (the others have other assignments); once a day the groups switch.

The program for this course is as follows (practical O1 is chapter 1 of the tutorial, O2 is chapters 2 and 3, and O3 is chapters 3 and 4; the other chapters are done later on in the course):

Day 1 a.m.: four thirty-minute talks about the following subjects:

* Introduction to Unix

* Introduction to O

* Databases in Molecular Biology

* The Protein Data Bank

Day 1 p.m.: two-hour practical (Unix exercise and O1) for each group of eight pairs

Day 2 a.m.:

* four-hour practical (O2) for eight pairs

* two 90-minute database practicals for two groups of two pairs

Day 2 p.m.:

* four-hour practical (O2) for eight pairs

* two 90-minute database practicals for two groups of two pairs

Day 3 a.m.:

* four-hour practical (PDB exercise and O3) for eight pairs

* two 90-minute database practicals for two groups of two pairs

Day 3 p.m.:

* four-hour practical (PDB exercise and O3) for eight pairs

* two 90-minute database practicals for two groups of two pairs

We work with four teaching assistants (TAs) for every O practical; another TA takes care of all database practicals (using Macintoshes with access to the Internet). The coordinator is present during all practicals and can jump in where needed.

The students that take the course are not expected to know anything at all about computers, Unix, graphics, O, proteins, crystallography, NMR, etc. etc. This is the reason why some basic principles of protein structure have to be introduced briefly in this part of the course. If this tutorial is used by protein crystallographers who are new to O, they will probably want to avoid drawing a partial Ramachandran plot by hand ...

Students need to have a work directory; they should be supplied with the coordinates of chain A of P2 myelin protein (a very poor and unrefined model !) and, separately, with those of the fatty-acid ligand inside chain A. Students should have paths (alternative: aliases) to reach the C-shell scripts "run" and "ono". Executables of VOIDOO, MOLEMAN, MAPMAN, ODBMAN and O2D are needed for some exercises. Also, the students must be able to run GhostScript or GhostView, or some other PostScript viewer (printing PostScript files is optional).

gerard@xray.bmc.uu.se

1.0 - Getting started

In this chapter you will learn how to run the O program, how to create a new O database file, how to import a molecule into the program/database, how to display a molecule and how to save your database.

New O commands:

Backup_DB

Ca_zone

Centre_ID

Centre_Xyz

Centre_atom

Centre_zone

Directory

End_object

Molecule_nam

Object_name

On_off

Sam_atom_in

Sam_list_seq

Save_DB

Stop

*

1.1 - Let's gO !

During the course, you can run the O program by typing "ono" at the Unix prompt:

unix > ono

Throughout this tutorial, lines starting with " unix > " indicate that you are in the Unix environment, where you have to use Unix commands. Lines starting with " O > " indicate that you are within O, which means that you have to type valid O commands. Things that you have to type in are shown in bold typeface. The output has sometimes been edited to conserve space; this is usually indicated by a series of three dots (...) on a line by itself. Occasionally, actions you have to take with the mouse, function keys or dials will be indicated in curly brackets, for instance: { click "On_off" }. Note that each chapter starts with an ultra-brief description of what you will learn, plus a list of new O commands that you will be taught how to use. There is white space next to these commands which you are encouraged to use for making your own notes (for instance, the syntax of the command, or an evaluation of how useful the command is, or a description in your own words of what the command actually does) !

The first time you start the program (make sure that you are in your work directory when you do this !), two new directories will be created automatically:

unix > ono

... Run 4d_ono

... Making a soft link to the odat directory for you

... Making a soft link to the omac directory for you

... Executing /nfs/taj/alwyn/o/bin/4d_ono

... For gerard on rigel at Thu Dec 23 15:37:56 MET 1993

The first one, called "odat", contains lots of files which all users need often. The other one, "omac", contains lots of useful O files which have been made by other O users.

Since this is the first time that you use O, you have to put some general information into the database. You do this by giving O the names of two general files (which reside in the "odat" directory):

* menu.o a file which contains the names of all O commands

* startup.o a file which contains lots of general information

Once you have told O the names of these files, it asks you again for a file name. Since you don't have to supply any more files at this stage, just hit the key marked "Enter" (also known as the "Return" or "Carriage Return" or "Linefeed" key):

O > Use of this program implies acceptance of conditions

O > described in Appendix 1 of the O manual

O > O version 5.9.2 , Tue Nov 23 12:20:13 MET 1993

O > Define an O file (terminate with blank): menu.o

O > ...file is formatted

O > menu.o file for O version 5.9

O > Define an O file (terminate with blank): startup.o

O > ...file is formatted

O > startup.o file for O version 5.9

O > Define an O file (terminate with blank):

Now O finds out that it is missing an important file, which defines which atoms are connected (and, therefore, which bonds have to be drawn in pictures of proteins that you are going to make). O says "Enter file name", and then suggests a file name which is enclosed in [square brackets]. This is a general mechanism in O: often when the program asks you a question, it will suggest an answer which is printed in between square brackets. Such a value (it can be a file name, but also a number, or a series of numbers) is called a "default (value)". If you want to use this default, all you have to do is hit the Return key. In this case, the default for the file name may be assumed to be okay, so accept it:

O > Enter file name [ /nfs/taj/alwyn/o/data/all.dat]:

O > Maximum inter-residue link distance = 2.00

O > There were 23 residues.

O > 175 atoms.

This particular file contains information about 23 amino-acid residue types. If you use this file to display a protein structure, all bonded atoms will get lines drawn between them. Inter-residue peptide bonds (between the carbonyl carbon of a residue i and the amide nitrogen of residue i+1) will be drawn if the distance between these two atoms is less than 2.0 Å.
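To make the 2.0 Å rule concrete, here is a minimal Python sketch of the distance test described above. It is not O's own code: the function name and the example coordinates are made up for illustration, and only the 2.0 Å cut-off is taken from the O output shown above.

```python
import math

# Cut-off reported by O above ("Maximum inter-residue link distance = 2.00"), in Å.
MAX_LINK_DISTANCE = 2.0


def peptide_bond_drawn(c_xyz, n_xyz, max_link=MAX_LINK_DISTANCE):
    """Return True if the carbonyl C of residue i and the amide N of
    residue i+1 are closer together than max_link (hypothetical helper)."""
    return math.dist(c_xyz, n_xyz) < max_link


# Made-up coordinates: a normal peptide bond (about 1.33 Å) and a gap of 3.8 Å.
normal_bond = peptide_bond_drawn((0.0, 0.0, 0.0), (1.33, 0.0, 0.0))
broken_chain = peptide_bond_drawn((0.0, 0.0, 0.0), (3.8, 0.0, 0.0))
print(normal_bond, broken_chain)
```

With these made-up numbers the first pair would be joined by a line and the second pair would not, which is how a break in a poorly built model can show up as a missing bond.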

The final thing that O wants to know is if you want to use the display. The default answer is "Yes"; only on rare occasions will your actual answer be "No". In this case, accept the default:

O > Do you want to use the display? [Yes]:

O > Graphics board GL4DXG-4.0

O > Making visibility data structures.

O > O > Trackball on (F7KEY)

Oops-a-daisy: all of a sudden you have an extra window on the screen ! This window is black and contains some information about the version of O that you are using. All the drawings that you are going to make will appear in this window. It is therefore called the "graphics window". The window in which you have been typing is called the "terminal window". In the terminal window, you see " O > ". This is called a "prompt"; it reminds you that you are actually "talking" to O, rather than to Unix. If the prompt doesn't appear, hit the Return key.

You probably do not want to type all the file names again the next time you use the program. And later, when your database contains protein atom coordinates etc., you will want O to "remember" all this information in between sessions. However, the database exists only in the memory of the program while the program is active; as soon as the program is stopped, all information is lost, unless you have stored it on a disk in a file.

This is what you are going to do now. The first O command that you will learn is called "Save_db" (note the special character "_" in between the words "Save" and "db"; where is this character on your keyboard ?). The fact that this is the first command that you learn does indeed mean that this is the most important command ! If you type this command, O will make a "carbon copy" of your current database in a real file on a real disk in your current directory. The first time that you use this command, O needs to know what you want to call this file. You may call the file anything you like, but you are advised to use a short, meaningful name, and to add ".o" to that name, so that when you list the contents of your directory you will easily remember that this is an O database file. Do yourself a favour and do NOT use any special characters (*, %, #, spaces, etc.) in the names of any files; just use a-z, A-Z, 0-9 and the dot (.), minus (-) and underscore (_) signs:

O > save_db

As1> File_O_save is not defined.

As1> Enter file name [ binary.o]: p2.o

The second command that you will learn is called "Backup_db". It does exactly the same as "Save_db", but to a different file. Sometimes, software or hardware problems may occur which lead to corruption of your database file (the one created with the "Save_db" command). In that case, it's good to have another copy of this file with a different name:

O > backup_db

As3> File_O_Backup is not defined.

As3> Enter file name [ backup.o]: p2_backup.o

Since the database is a sort of file system, you can list its contents. The command to do this is called "Directory". When you type this command, O wants to know which datablocks ("files") you want to have listed. In this case there is no default value indicated in square brackets, but if you just hit the Return key, you will get a listing of all of them. You may also type an asterisk (*) which O takes to mean "all datablocks" (the asterisk is called a "wildcard").
O > directory

Heap> Which param blocks :

Heap> .MENU_MAJOR_NAME C W 76

Heap> .MENU_MINOR_NAME C W 758

...

Heap> .OBJECT_OBJ_DISP_ATOMS T W 720

Heap> .OBJECT_VIS_DISP_ATOMS I W 20

Heap> 159 data blocks used, space for 2000

Heap> 1438 integer/real units used, space for 1000000

Heap> 1282 character units used, space for 200000

Heap> 23820 text units used, space for 500000

Each datablock ("file") has a name (e.g., ".MENU_MAJOR_NAME"), a type (I, R, C or T), a read/write-flag (R or W) and a size. The type can be Integer (0, 1, 2, -6342 and 853526 are examples of integer numbers), Real (0.0, 3.14 or -9823.87, for example), Character (which may be any text of up to 6 characters) or Text (which may be any text of up to 72 characters). The read/write-flag is R for datablocks which will not be saved when you save your database to a file (temporary datablocks); it is W for all your normal datablocks which will be saved.

The datablock size of R and I datablocks tells you how many numbers a datablock contains. For example, the datablock ".MENU_INTEGER" contains three integer numbers, and ".MENU_REAL" contains two real numbers. For C-type datablocks, the size is the number of six-character strings in the datablock. For T-type datablocks, the size is the number of characters in the datablock plus one.

The datablock names may contain up to 25 characters (no spaces); they are printed by O in UPPERCASE, but O recognises them irrespective of how you type them. In other words, ".TIMESTAMP" and ".timestamp" are identical datablocks. This is a general feature of O: commands, parameters etc. may be typed in lowercase, UPPERCASE or MiXeD cAsE. There is one exception, namely names of files. The reason for this is that Unix DOES consider "p2.o", "p2.O" and "P2.o" and "P2.O" to be four DIFFERENT names.

At the bottom of the directory list, you see how many datablocks there are in your current database (and for how many there is room inside the computer's memory).
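The directory listing has a regular layout, which the following Python sketch picks apart. It is only an illustration of the fields described above and not part of O: the function names are hypothetical, the sample lines are copied from the listing above, and treating the asterisk the way Python's fnmatch module does is an assumption about how O's wildcard behaves.

```python
from fnmatch import fnmatchcase

# The four datablock types described in the text.
TYPE_NAMES = {"I": "integer", "R": "real", "C": "character", "T": "text"}


def parse_heap_line(line):
    """Split a 'Heap> NAME TYPE FLAG SIZE' line into its four fields.

    Returns None for any other line (prompts, '...', the summary lines).
    """
    fields = line.split()
    if len(fields) != 5 or fields[0] != "Heap>":
        return None
    name, kind, flag, size = fields[1:]
    if kind not in TYPE_NAMES or flag not in ("R", "W") or not size.isdigit():
        return None
    return {
        "name": name,
        "type": TYPE_NAMES[kind],
        "saved": flag == "W",  # W = kept when the database is saved
        "size": int(size),
    }


def select(blocks, pattern="*"):
    """Match datablock names against a wildcard, ignoring upper/lower case."""
    return [b for b in blocks if fnmatchcase(b["name"].upper(), pattern.upper())]


listing = """\
Heap> .MENU_MAJOR_NAME C W 76
Heap> .MENU_MINOR_NAME C W 758
Heap> .OBJECT_OBJ_DISP_ATOMS T W 720
Heap> .OBJECT_VIS_DISP_ATOMS I W 20
Heap> 159 data blocks used, space for 2000
"""

blocks = [b for b in map(parse_heap_line, listing.splitlines()) if b]
for block in select(blocks, ".menu*"):
    print(block["name"], block["type"], block["size"])
```

Converting both the name and the pattern to uppercase before matching mirrors the rule that ".TIMESTAMP" and ".timestamp" are the same datablock.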
Now stop the program by issuing the command "Stop":

O > stop

As1> Saved

As1> Graphics released.

Note that O automatically saves your database when you stop the program. Check that the database files exist in your directory:

unix > ls -FartCos

1 lrwxrwxrwx 1 gerard 16 Dec 23 15:41 omac

1 lrwxrwxrwx 1 gerard 21 Dec 23 15:41 odat

84 -rw-r--r-- 1 gerard 42794 Dec 23 16:14 p2_backup.o

84 -rw-r--r-- 1 gerard 42794 Dec 23 16:16 p2.o

1.3 - Well, well, what's all this then ?

The next time that you use O (i.e., now), just go to your work directory and type "ono", followed by the name of the file to which you saved your O database:

unix > ono p2.o

... Run 4d_ono

... Executing /nfs/taj/alwyn/o/bin/4d_ono p2.o

... For gerard on rigel at Thu Dec 23 16:18:03 MET 1993

O > Use of this program implies acceptance of conditions

O > described in Appendix 1 of the O manual

O > O version 5.9.2 , Tue Nov 23 12:20:13 MET 1993

O > Loading p2.o

O > Maximum inter-residue link distance = 2.00

O > There were 23 residues.

O > 175 atoms.

O > Do you want to use the display? [Yes]:

O > Graphics board GL4DXG-4.0

O > Making visibility data structures.

O > Making visibility data structures.

O > O > Trackball on (F7KEY)

Now O doesn't need to read "menu.o" etc. again; instead, it asks you immediately if you want to use the display (i.e., the graphics window). Check that the database contains the same number of datablocks as it did the last time you used O:

O > dir ;

Heap> .MENU_MAJOR_NAME C W 76

...

Heap> .OBJECT_OBJ_DISP_ATOMS T W 720

Heap> .OBJECT_VIS_DISP_ATOMS I W 20

Heap> 159 data blocks used, space for 2000

Heap> 1438 integer/real units used, space for 1000000

Heap> 1282 character units used, space for 200000

Heap> 23820 text units used, space for 500000

Hey, wait a minute ! Shouldn't you type "Directory", then hit the Return key, wait for O to ask us which datablocks to list and then hit Return again ?
The answer is: you may, but:

* first, you usually don't have to type the full names of O commands (see below)

* second, you may provide parameters to commands (i.e., answers to questions that O asks you when you execute a command) on the same line as the command. This is often done in this tutorial in order to save space. You, however, are encouraged to find out what the parameters for the O commands that you will learn are. You can find this out by checking the O Manual, or by simply typing only the command, hitting the Return key and letting O prompt you for the values of the parameters.

* third, the semi-colon (;) is a special character in O: it means: "rather than me waiting for O asking me to provide a value for this parameter, I accept the default that O will come up with, whatever it may be"

About abbreviating commands: in O you only have to type the part of a command that makes it unique, i.e. such that it cannot be confused with the name of another command which starts with the same letter(s). Let's take the "Directory" command: all of the following command abbreviations are recognised by O as meaning "Directory":

directory, director, directo, direct, direc, dire, dir

However, "di" is not unique, since there are several other O commands whose names begin with the letters "di":

O > di

O > DI is not a unique keyword.

O > Directory is a possibility.

O > Dial_box is a possibility.

O > Dial_previou is a possibility.

O > Dial_next is a possibility.

O > Dist_define is a possibility.

O > DI is not a visible command.

1.4 - My first protein structure !

If you do not have a file called "p2a.pdb" in your work directory, ask your assistant for help. This file contains the coordinates of all non-hydrogen atoms of a protein called P2 myelin protein. This protein contains 131 amino-acid residues and a total of 1038 non-hydrogen atoms. The file is in so-called PDB format (have a look at its contents if you like, but don't change anything in it !).
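If you are curious about what is inside a PDB file, the Python sketch below reads its atom records. The fixed column positions are those of the standard PDB ATOM/HETATM record; the function names and the three sample records are invented for illustration and are not taken from "p2a.pdb".

```python
def read_pdb_atoms(lines):
    """Collect the atom records from the lines of a PDB file.

    PDB is a fixed-column format, so the fields are cut out by position
    (Python slices start counting at 0, PDB columns at 1).
    """
    atoms = []
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")):
            atoms.append({
                "atom": line[12:16].strip(),      # columns 13-16: atom name
                "res_type": line[17:20].strip(),  # columns 18-20: residue type
                "chain": line[21],                # column 22: chain identifier
                "res_id": line[22:27].strip(),    # columns 23-27: number + insertion code
                "xyz": (float(line[30:38]),       # columns 31-38: x
                        float(line[38:46]),       # columns 39-46: y
                        float(line[46:54])),      # columns 47-54: z
            })
    return atoms


def count_residues(atoms):
    """Count distinct (chain, residue id) pairs."""
    return len({(a["chain"], a["res_id"]) for a in atoms})


# Three made-up records, laid out in the standard columns.
sample = [
    "ATOM      1  N   SER A   1      50.170  57.020  19.380  1.00 20.00",
    "ATOM      2  CA  SER A   1      49.500  56.100  18.400  1.00 20.00",
    "ATOM      7  N   ASN A   2      46.610  56.250  17.610  1.00 20.00",
]

atoms = read_pdb_atoms(sample)
print(count_residues(atoms), "residues,", len(atoms), "atoms")
```

Run on the lines of your own "p2a.pdb" (for example `read_pdb_atoms(open("p2a.pdb"))`), the two counts should agree with the 131 residues and 1038 atoms mentioned above, provided the file contains nothing but the atoms described.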
Before you can draw this structure, you have to store the coordinates and names of the atoms in your O database. This can be done with the "Sam_atom_in" command:

O > s_a_i

Sam> Name of input file: p2a.pdb

Sam> O associated molecule name: p2a

...

Sam> Molecule P2A contained 131 residues and 1038 atoms

Note how the command name has been abbreviated ! O wants to know what the name of the coordinate file is (no default value). Also, you have to give the molecule a name (no more than 5 characters). This is necessary, since you may have coordinates of ten or fifty different proteins in your database; in that case, you have to be able to tell O of which structure you want to make a drawing, etc. Let's see if the structure exists in the database:

O > dir *

Heap> .MENU_MAJOR_NAME C W 76

...

Heap> P2A_ATOM_XYZ R W 3114

Heap> P2A_ATOM_B R W 1038

Heap> P2A_ATOM_WT R W 1038

Heap> P2A_ATOM_Z I W 1038

Heap> P2A_ATOM_VISIBLE I W 1038

Heap> P2A_ATOM_SELECT I W 1038

Heap> P2A_RESIDUE_NAME C W 131

Heap> P2A_RESIDUE_TYPE C W 131

Heap> P2A_ATOM_NAME C W 1038

Heap> P2A_RESIDUE_POINTERS I W 262

Heap> P2A_RESIDUE_CG R W 524

Heap> P2A_CELL R W 6

Heap> P2A_SPACEGROUP T W 13

Heap> P2A_PDB_SCALE R W 12

Heap> P2A_DATE T W 25

Heap> 170 data blocks used, space for 2000

Heap> 10506 integer/real units used, space for 1000000

Heap> 2582 character units used, space for 200000

Heap> 22418 text units used, space for 500000

Perhaps you expected just one datablock ("p2a.pdb") ? However, contrary to Unix, O knows a thing or two about proteins (thanks to the files you gave it when you first created your database !).
What O has done is to extract information about P2 myelin protein from the PDB file and stored it in three types of datablocks:

* datablocks which contain information about the protein as a whole (e.g., "P2A_SPACEGROUP")

* datablocks which contain one item of information for each residue in the protein (e.g., "P2A_RESIDUE_TYPE")

* datablocks which contain information about the individual atoms in the protein (e.g., "P2A_ATOM_XYZ")

Note that the names of the datablocks have structure:

* they all start with "P2A_"; remember that you called this molecule "p2a" !

* residue information is stored in datablocks whose names begin with "P2A_RESIDUE_"

* atomic information is stored in datablocks whose names begin with "P2A_ATOM_"

This structure makes it easy to use the "Directory" command and get only a list of datablocks related to your P2A molecule:

O > dir p2a_*

Heap> P2A_ATOM_XYZ R W 3114

Heap> P2A_ATOM_B R W 1038

Heap> P2A_ATOM_WT R W 1038

Heap> P2A_ATOM_Z I W 1038

Heap> P2A_ATOM_VISIBLE I W 1038

Heap> P2A_ATOM_SELECT I W 1038

Heap> P2A_RESIDUE_NAME C W 131

Heap> P2A_RESIDUE_TYPE C W 131

Heap> P2A_ATOM_NAME C W 1038

Heap> P2A_RESIDUE_POINTERS I W 262

Heap> P2A_RESIDUE_CG R W 524

Heap> P2A_CELL R W 6

Heap> P2A_SPACEGROUP T W 13

Heap> P2A_PDB_SCALE R W 12

Heap> P2A_DATE T W 25

The parameter "p2a_*" means: all datablocks whose names begin with "p2a_". Similarly, to get a list of all residue-related datablocks of molecule P2A, use:

O > dir p2a_resid*

Heap> P2A_RESIDUE_NAME C W 131

Heap> P2A_RESIDUE_TYPE C W 131

Heap> P2A_RESIDUE_POINTERS I W 262

Heap> P2A_RESIDUE_CG R W 524

And to get a list of all molecules in your database for which you have atomic coordinates, use:

O > dir *xyz

Heap> ALPHA_ATOM_XYZ R W 105

Heap> BETA_ATOM_XYZ R W 120

Heap> DI_ATOM_XYZ R W 30

Heap> P2A_ATOM_XYZ R W 3114

Oops - you never loaded molecules called "ALPHA", "BETA" or "DI" !? These three are special "mini-molecules" whose use will not be discussed now.
Just remember to never call one of your own molecules "ALPHA", "BETA" or "DI" !

Let's see what kind of information some of the datablocks of your P2A molecule contain:

P2A_ATOM_XYZ = Cartesian X, Y and Z coordinates

P2A_ATOM_B = Atomic temperature factors

P2A_ATOM_Z = Atomic numbers (e.g., 6 for carbon atoms)

P2A_ATOM_NAME = Names of the atoms (e.g., N, CA, CD1)

P2A_RESIDUE_NAME = Names of the residues (e.g., 1, 2, B3)

P2A_RESIDUE_TYPE = Types of the residues (e.g., Ala, Trp)

Now, save and backup your database so that you won't have to read the molecule in again the next time that you use O.

1.5 - Connecting the dots

You are now ready to draw the structure of P2 myelin protein. This requires the following steps:

Tell O which molecule you want to use. This is done with the command "Molecule_name" (usually abbreviated "mol"):

O > mol

O > Current molecule has not been loaded.

Mol> Molecule code name []: p2a

In O, you can have many, many drawings in the graphics window. Each of these is called an "object", and, just like molecules, objects must have a name (so you can refer to them, for example when you want to change their colour, or when you want to delete one). The command to define such a name is called "Object_name" (often abbreviated "obj"). Names of objects may not contain more than 6 characters:

O > obj

Mol> Name of the new object [P2A ]: first

Now you have to start drawing things, for example, using the "Ca_zone" command (usually abbreviated "ca"). This command will draw lines between C-alpha atoms in neighbouring residues (provided their distance is not too large). O asks you which residues should be included in the drawing. Just accept the default ("all residues") for the moment:

O > ca

Mol> Ca zone [all molecule]:

Is there something wrong ? There's nothing on the screen ! Don't worry; O doesn't actually draw objects until you tell it that you have entered all the drawing commands for this particular object.
You do this by issuing the "End_object" (abbreviated "end") command:

O > end

Still nothing ! But, there's one change: in the top right corner of the graphics window you can see the following:

On_off ^FIRST

This list of "words" is called the O "menu". Right now, there's not much there: an O command called "On_off" and -surprise !- the name of the graphics object that you just created: "FIRST", although it has a "caret" (^) in front of it. This means that the object really exists. The reason you can't see it is that you are somewhere in space -initially at (0,0,0)- and the protein is somewhere else. To fix that, type the following command:

O > centre_xyz 50 65 33

Yeah ! Bingo ! You have the drawing at the centre of the graphics window !

1.6 - Shake, rattle 'n' roll

Of course, looking at a static picture of a protein is not very interesting (you don't need expensive computers to do just that). You want to rotate, translate, zoom in, etc. In O, there are three different ways to do this:

* using the dials: you have eight dials on the "dial box"; in the bottom left corner of the graphics window you see what each dial is supposed to do. Verify this and try to explain what the "slab" dial does (first zoom in; then turn the slab dial counter-clockwise). If the rotations are too fast, type: "db_set_data .dial_real 1 1 0.15" (don't worry about the meaning of this command).

* using the mouse:

RIGHTMOUSE = xyz rotation

RIGHTMOUSE + SHIFTKEY = x/y translation

RIGHTMOUSE + SHIFTKEY + MIDDLEMOUSE = z translation

RIGHTMOUSE + MIDDLEMOUSE = zoom

RIGHTMOUSE + LEFTMOUSE = slab

* using a pseudo-dial box on the screen: first press the F6 key on your keyboard while the cursor is in the graphics window. You'll see the outlines of a new window; move it somewhere outside your graphics window and press the left mouse button. Now move the cursor to the box marked "Rot Y" and put it on top of the letter "Y".
Press the left mouse button and keep it pressed down while you slowly move the mouse to the left and to the right.

Play around with each of these methods and use the mechanism that you are most comfortable with. Can you get a view perpendicular to the axis of a helix ? To get rid of the pseudo-dials, press the F6 key again (while the cursor is in the graphics window). To switch off the mouse control, press the F7 key.

Type "centre_xyz 50 65 33" again. Now the middle of your molecule is at the centre of the graphics screen again. This is because the point with coordinates (50, 65, 33) (Å) is close to the centre of the molecule. However, if you enter an unknown molecule into your database, you will not know where its centre is. The following, general, procedure can then be used. Use the command "Sam_list_seq" to find out what the names of the residues in the molecule are:

O > s_l_s

Sam> Molecule name [P2A ]:

Sam> Name Type From To Centre Radius

Sam> A1 SER 1 6 50.17 57.02 19.38 2.19

Sam> A2 ASN 7 14 46.61 56.25 17.61 2.89

...

Sam> A130 LYS 1023 1031 38.01 55.56 27.25 3.64

Sam> A131 VAL 1032 1038 33.84 57.47 27.11 2.33

In this case, you have 131 residues with NAMES A1, A2, ... A131. Now you can use another command, "Centre_zone" to put the centre of gravity of the molecule at the centre of the screen. A "zone" is an important concept in O: it defines a stretch of one or more consecutive residues in one molecule. You define it by giving O the molecule name, followed by the names of the first and the last residue in the zone:

O > ce_zo p2a a1 a131

As4> P2A A1 A131 FIRST

As4> Centering on zone from A1 to A131

If you want to centre on a specific residue, type its name twice (or use ;):

O > ce_zo p2a a1 ;

As4> No object defined.
  As4> P2A A1 A1 FIRST
  As4> Centering on zone from A1 to A1

If you want to centre on the C-alpha atom of a certain residue, use the "Centre_atom" command:

  O > ce_at p2a a131

If you want to centre on an atom other than the C-alpha of a certain residue, use the same command, but add the atom name:

  O > ce_at p2a a34 n

But, you may ask, how do you know which residues and atoms are which ? This is done by "picking" (a.k.a. "ID-ing"). This means that you move the cursor to one of the atoms in the picture and then press the left or middle mouse button. If you do this, you will see:

* a label appearing next to the picked atom (e.g., "A3 CA"; these are the name of the residue and the name of the atom, respectively)

* a text at the top of the graphics screen which tells you more about the atom; for example:

    P2A A3 Lys CA , xyz = 44.97 54.24 20.97 ; B = 20.0 ; Z = 6 ;

Which two C-alpha atoms are not connected, even though they are in neighbouring residues ? Why are they not connected ?

There's another centre command which is useful in cases like these: "Centre_id". If you type this command, O expects you to pick an atom and it will centre on that atom. Use this command to centre on one of the two C-alpha atoms that are not connected in order to find out which residues they belong to.

Click on the text "^FIRST" in the graphics window with the left or middle mouse button. What happens ? Now type "^first" in the terminal window. What happens ? This demonstrates yet another important principle in O (about which you will learn more later): commands or parameters or even text (molecule names, for example) can be entered with the keyboard, or they can be put on the menu and then clicked.

1.7 - Question time !

(1) Explain the difference between working in Unix and working in O.

(2) Which of the following commands can you use in Unix, and which in O (and what do they do): ls, stop, rm, directory, cp, ca_zone, save_db, cat, centre_atom, jot ?

(3) Explain, define or describe the following concepts: default, caret, zoom, prompt, graphics window, wildcard, mouse, cursor, molecule, object, uppercase, underscore, asterisk, pseudo-dials.

(4) What is the difference between "real" and "integer" numbers ?

(5) What information about datablocks is listed with the "directory" command ?

(6) What is the minimal abbreviation of the following commands: save_db, sam_atom_in, ca_zone, centre_xyz, stop ?

(7) What are the full names and the parameters of the following commands: s_a_i, ce_zo, mol, obj, ce_xyz ?

(8) Suppose you have a protein structure in a file called "1guh.pdb". Which commands do you have to type (including their parameters) in order to read the structure into your database, to draw a C-alpha trace of it and to centre on the middle of the molecule ?

(9) Explain the difference between the "mol" and the "obj" command.

(10) Explain the concept of a "zone" as defined in O.

(11) Which commands can be used to centre on a particular point in space ?

(12) Move the cursor to your graphics window, type "dir *resid*" and hit the Return key. Explain your observations.

(13) Suppose that someone told you that there is a command called "CPK_object" in O. How can you verify that this command exists ? Describe what the command does and what its parameters are.

Your notes:

2.0 - Properties, structure and paint

In this chapter you will learn about properties of molecules, residues and individual atoms, and how to use these for painting your residues.

New O commands:

  Clear_ID Cover_Sphere Delete_objec Paint_case
  Paint_colour Paint_obj_at Paint_obj_zo Paint_object
  Paint_proper Paint_ramp Paint_zone Read_formatt
  Sphere_centr Write_format YASSPA Zone

Your notes:

2.1 - More detail

Start O again. The first thing you should notice is that the object that you created last time is still in the graphics window. O stores the drawing instructions in the database and these are therefore kept in between sessions.
The object that you have, "FIRST", is a simple C-alpha trace. If you want to see side chains, you may use the "Zone" command. Type "Zone" followed by the names of the first and the last residue of the zone of residues whose sidechains you want to see:

  O > mol p2a
  O > obj bit
  O > zone a5 a14
  O > zone a103 a109
  O > zone a131 ;
  O > z a1 a1
  O > end

Note that the graphics object called "bit" contains several zones of residues and two individual residues. Also note that different chemical elements are drawn in different colours. What is the colour of oxygen atoms ? And of nitrogen atoms ? Switch the two objects on the screen off and on. What is the name of the tryptophan residue that you have drawn ? Switch all objects on. Centre on the N-epsilon atom of the arginine residue that you have drawn.

Often you will want to draw residues that are close to a certain atom or residue. If you want to draw, say, all residues that are within 5 Å of the N-epsilon atom of the current arginine, you may use the "Cover_sphere" command. This command can be given in two different ways:

* cover_sphere residue_name radius
* cover_sphere residue_name atom_name radius

  O > obj cover
  O > cov_sph a106 ne 5
  O > end

Experiment with the different ways of issuing the "Cover_sphere" command. Which residues are drawn if you use a 5 Å radius around the N-epsilon atom ?

Sometimes, you don't want to draw residues close to a particular atom, but rather those close to the current centre of the screen. In that case, use the "Sphere_centre" command; the only parameter of this command is the desired radius in Å. Centre on the middle of the molecule and use this command to draw all residues within 8 Å of this point:

  O > ce_zo p2a a1 a131
  As4> P2A A1 A131 SPH
  As4> Centering on zone from A1 to A131
  O > obj sph
  O > sphere 8 end

Which aromatic residues are drawn now ?

Note that in the last line above, you typed "sphere 8 end". In other words, you typed TWO O commands on one line. This is perfectly valid !
You may even issue three, four or more commands on a single line. Now make an object called "TEST" which contains all residues that are within 6 Å from the C-alpha atom of the N-terminal residue; type all commands on a single line. How many leucines are drawn ?

By now, you probably have been clicking on quite a few atoms. The command "Clear_id" can be used to remove the labels from the graphics screen. Try this.

You may now also remove all objects except "FIRST". To do this, use the "Delete_object" command:

  O > de
  Mol> Objects = FIRST BIT COVER SPH TEST
  Mol> Object name ( <CR> = exit ) : test
  Mol> Objects = FIRST BIT COVER SPH
  Mol> Object name ( <CR> = exit ) :
  O > del cover sph
  Mol> Objects = FIRST BIT
  Mol> Object name ( <CR> = exit ) :
  O > del bit ;

Compare the different ways of using this command in this example.

2.2 - Structure

It's time to do something non-trivial which is relevant for proteins. O contains a command that will figure out where in your protein the alpha helices and the beta strands are. It is called "YASSPA" ("Yet Another Secondary Structure Prediction Algorithm"). You have to execute this command twice, once to find the helices and once to get the strands. This command uses two of the three "mini-molecules" that you encountered earlier (which two, do you think ?). For each residue, O considers the two neighbouring residues on both sides as well, and uses their C-alpha coordinates to decide if the central residues are in a helix or a strand. Just repeat the commands that follow:

  O > yasspa
  Util> Molecule name ([P2A ]):
  Util> Template molecule name ([alpha]):
  Util> Cuttoff ([0.5Ang]):
  Util> Template size : 5 residues.
  Util> There were 17
  O > yasspa p2a beta 0.8
  Util> Template size : 5 residues.
  Util> There were 75

Well, that's not very informative ... P2 myelin protein apparently contains 17 residues in helices and 75 in strands. But, hey, didn't O store a lot of information in the database ?
Check if there are any new datablocks related to molecule P2A:

  Heap> P2A_ATOM_XYZ R W 3114
  Heap> P2A_ATOM_B R W 1038
  ...
  Heap> P2A_DATE T W 25
  Heap> P2A_MOLECULE_TYPE C W 2
  Heap> P2A_MOLECULE_CA C W 1
  Heap> P2A_MOLECULE_CA_MXDST R W 1
  Heap> P2A_ATOM_COLOUR I W 1038
  Heap> P2A_RESIDUE_2RY_STRUC C W 131

In fact, there are five new ones (which ?). But the one created by "YASSPA" must be "P2A_RESIDUE_2RY_STRUC". How do you access the information in it ?

The first method is by looking at the contents of the datablock. This can be done with the "Write_formatted" command. The command has three parameters:

* the name of the datablock to be written
* the name of the file to which it should be written, OR a semi-colon (;) which means: "write to the terminal window"
* the format (don't worry about this; just use a semi-colon)

  O > wr P2A_RESIDUE_2RY_STRUC ;;
  P2A_RESIDUE_2RY_STRUC C 131 (1x,5a)
  BETA BETA BETA BETA BETA BETA ... BETA BETA BETA BETA

Apparently, YASSPA has written the word "ALPHA" for every residue in a helix and "BETA" for residues in strands. What about the other residues ?

There is one place where a helix is followed immediately by a strand. This is not very likely to be true (it just shows that YASSPA isn't perfect). Therefore, write the datablock to a file and change the secondary structure assignment for the neighbouring ALPHA/BETA residues to "nothing", i.e. spaces, by editing the file (do this in Unix).

Once you have edited the file, you have to get it back into O. To do this, use the "Read_formatted" command (the only parameter of this command is the name of the file). After that, verify that the datablock has indeed been changed and read correctly by typing its contents to the terminal window again.

Now you have a sort of list of the secondary structure types of the residues in P2 myelin protein.
But you may like a more graphical representation, for example a drawing where each residue is coloured according to its type (e.g., helix red, strand green, rest yellow). In the next section you will learn how to do this.

2.3 - Painting

Colour is an extremely powerful means of conveying information in molecular drawings. Within O there are dozens of predefined (and zillions of user-definable) colours available. The command "Paint_colour" can be used to select a colour, or to get a list of all predefined colours (by using a question mark, ?, as parameter):

  O > paint_colour ?
  Paint> Available colors:
  Paint> aquamarine black blue
  Paint> blue_violet brown cadet_blue
  Paint> coral cornflower_blue cyan
  Paint> dark_green dark_olive_green dark_orchid
  Paint> dark_slate_blue dark_slate_gray dark_slate_grey
  ...
  Paint> thistle turquoise violet
  Paint> violet_red wheat white
  Paint> yellow yellow_green
  Paint> Error condition [Colour name not in database] in askcol

You should still have the "FIRST" object on the graphics screen (if not, draw it again). If you want to paint this object in a nice colour, e.g. dark_slate_blue, then first select this colour with the "Paint_colour" command. Next, use the "Paint_object" command and either type the name of the object, or click on an atom in the object that you want to colour:

  O > pai_col
  Paint> Colour? [orange]: dark_slate_blue
  O > pai_object first
  Paint> FIRST

If you have some time to spare, and if you are curious to find out what all these predefined colours are, then type the following commands (you will learn later what the second one means):

  O > read omac/colour_demo.odb
  O > @col

Cute, isn't it ? Select a colour that appeals to you and use it to paint your "FIRST" object. Delete the "ALL_COL" object and centre on your molecule again. Whenever you want to see the colours again, just type "@col".

Instead of colouring an entire object, you may also paint a zone, a residue or even a single atom inside an object.
Use the commands "Paint_obj_zone" and "Paint_obj_atom" to do this. For both commands you may either type the parameters in the terminal window, or you may pick one or two atoms to identify the zone/residue/atom:

  O > pa_col medium_forest_green
  O > pai_obj_zo p2a a1 a15 first
  Paint> P2A A1 A15 FIRST
  O > pa_col slate_blue pai_obj_zo p2a a131 a131 first
  Paint> P2A A131 A131 FIRST
  O > pa_co red pa_obj_at p2a a1 ca first
  Paint> P2A A1 CA FIRST

Note that up until now, you have been colouring OBJECTS, but you haven't changed the colours of the atoms in the database. In O, each atom has a colour associated with it (which datablock contains the atom colours of your P2 myelin protein molecule, do you think ?). The default is to colour carbons yellow, oxygens red, nitrogens blue, sulphurs green etc. In addition, each molecular graphics object has colour information associated with it. So even though you coloured the C-alpha atom of the N-terminal residue red in the "FIRST" object, this atom still has the colour yellow associated with it in the P2A molecule in the database ! Verify this by drawing a zone containing only this residue. It is important that you realise the difference between colouring an object (zone/atom) and a molecule (zone/atom).

So far, you have only been colouring parts of objects that you selected yourself. This is useful, for instance, if you want to paint a particular helix, or residues near the active site or ligand-binding cavity. An even more interesting application of painting is colouring your molecule (i.e., not the object !) according to certain residue or atom properties. Residue properties are all those which are stored in datablocks called "P2A_RESIDUE_xxx", atom properties are stored in datablocks called "P2A_ATOM_xxx".

There are four O commands which you can use to change the colour of your molecule (on a per-residue or per-atom basis): "Paint_zone", "Paint_case", "Paint_property" and "Paint_ramp".
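
The difference between colouring an object and colouring a molecule, as described above, can be mimicked in a few lines of Python. This is only a toy model and certainly not how O works internally: the names Molecule, make_object, paint_obj_atom and paint_atom are made up for this sketch, and it simply assumes that an object takes a copy of the atom colours at the moment it is drawn.

```python
# Toy model (hypothetical names; not O's real data structures).
from dataclasses import dataclass, field


@dataclass
class Molecule:
    # Colour stored per atom in the database, keyed by (residue, atom).
    atom_colour: dict = field(default_factory=dict)


def make_object(mol):
    # Assumption: drawing an object copies the current atom colours.
    return dict(mol.atom_colour)


def paint_obj_atom(obj, key, colour):
    # Plays the role of "Paint_obj_atom": changes the object only.
    obj[key] = colour


def paint_atom(mol, key, colour):
    # Plays the role of painting the molecule: changes the database only.
    mol.atom_colour[key] = colour


p2a = Molecule({("A1", "CA"): "yellow"})
first = make_object(p2a)

paint_obj_atom(first, ("A1", "CA"), "red")
# The object is red, but the database still says yellow.
print(first[("A1", "CA")], p2a.atom_colour[("A1", "CA")])

paint_atom(p2a, ("A1", "CA"), "medium_blue")
test = make_object(p2a)
# The old object keeps its colour; a newly drawn object gets the new one.
print(first[("A1", "CA")], test[("A1", "CA")])
```

In this model, painting an object never touches the molecule, and painting the molecule only shows up in objects that are drawn afterwards.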

The "Paint_zone" command is the equivalent of the "Paint_obj_zone" command, except that "Paint_zone" actually changes the colours associated with the atoms. Any objects that were generated previously retain their old colours (since the colour information is stored for each object). Verify this as follows:

  O > paint_zone
  Paint> What molecule [P2A ]:
  Paint> Residue range [all molecule]: a16 a88
  Paint> Colour? [red]: medium_blue
  O > obj test zo ; end

The "Paint_case" command colours atoms according to the value of a certain property: you list a series of possible values and give a (possibly different) colour for each of them. Try the following:

  O > mol p2a pai_zon ; ; white
  O > pain_cas
  Paint> Colour-case a property in molecule P2A
  Paint> Property [atom_z] : residue_type
  Paint> How many cases [8] ? 4
  Paint> Enter property values [ 1 2 3 4] :
  Paint> Property value 1 : trp
  Paint> Property value 2 : his
  Paint> Property value 3 : phe
  Paint> Property value 4 : tyr
  Paint> Enter 4 colour names:
  Paint> Colour? [white]: blue
  Paint> Colour? [blue]: red
  Paint> Colour? [red]: green
  Paint> Colour? [green]: yellow
  O > obj test2 zo ; end

What you have done is this:

* coloured all atoms in your molecule white
* coloured certain atoms in other colours, in case their residue type is Trp (blue), His (red), Phe (green) or Tyr (yellow)
* drawn an object with this colouring scheme

From this object you can see immediately how many His, Trp, Phe and Tyr residues you have and where they are in the structure and with respect to one another. Which two phenylalanines, which are close in space, have almost parallel stacked rings ?

Of course, you can also use properties of individual atoms to colour them.
Try to figure out what happens in the following examples:

  O > pai_zon ; ; white
  O > pai_case atom_name 4 n ca c o blue yellow red magenta
  O > obj test3 zo ; end

  O > pai_zo ;; white
  O > pa_ca atom_z 3 6 7 8 green blue red
  O > obj test4 zo ; end

Now, see if you can paint the entire protein white, but all arginines, lysines and histidines blue and all glutamates and aspartates red ! Why might this be interesting ?

The "Paint_property" command is similar to the previous one, except that you may paint using comparison operators (=, > etc.) rather than just for a few individual cases. Explain what happens in the following examples:

  O > pai_zon ; ; white
  O > pai_prop res_type = gly red
  O > pai_prop res_type = ala red
  O > pai_prop res_type = trp blue
  O > pai_prop res_type = arg blue
  O > obj test5 zo ; end

  O > pai_zon ; ; white
  O > pai_prop res_name < a61 red
  O > pai_prop res_name >= a61 blue
  O > obj test6 zo ; end

Do you think you can colour your molecule according to the secondary structure assignment now ? Here's how it works:

  O > pai_zon ; ; white
  O > pai_prop res_2ry = beta sky_blue
  O > pai_prop res_2ry = alpha orange_red
  O > obj yasspa ca ; end

Check that the two residues for which you changed the secondary structure assignments are indeed coloured white !

The final paint command that you will learn about in this chapter is "Paint_ramp". You can only use this with numerical properties. The example which you shall use here is that of the "O internal residue counter"; for the first residue in your sequence, this number is 1, for the next 2, etc. In this way, you can paint each residue depending on where it is in the sequence. What the "Paint_ramp" command does is to use the value of the property to calculate a colour which lies in between two "extreme colours" which you define.
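
One way to picture what such a ramp does is plain linear interpolation between the two extreme colours. The Python sketch below assumes exactly that, working on RGB triples; this is an assumption for illustration only, since O may well compute its intermediate colours in another way. The function name ramp_colour and the RGB values are made up for this sketch.

```python
# Sketch of a colour ramp. Assumption: plain linear interpolation in RGB;
# O may compute its intermediate colours differently.

def ramp_colour(value, vmin, vmax, first, second):
    """Return the RGB colour for `value` on a ramp from `first` to `second`."""
    # Fraction of the way from vmin to vmax, clamped to the range [0, 1].
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))
    return tuple(round(a + t * (b - a)) for a, b in zip(first, second))


blue = (0, 0, 255)   # illustrative RGB values, not O's colour table
red = (255, 0, 0)

# Residue counter running from 1 to 131, as in P2 myelin protein.
print(ramp_colour(1, 1, 131, blue, red))    # first residue: first colour
print(ramp_colour(131, 1, 131, blue, red))  # last residue: second colour
print(ramp_colour(66, 1, 131, blue, red))   # middle residue: in between
```

Values below the minimum or above the maximum are clamped here, so they simply get one of the two extreme colours; that, too, is a choice made for this sketch.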

For example:

  O > pa_ram
  Paint> Colour-ramp a property in molecule P2A
  Paint> Property [residue_irc] :
  Paint> Minimum and maximum value of property [1 131] :
  Paint> First colour [red] : sky_blue
  Paint> Second colour [blue] : orange_red
  O > obj test7 ca ; end

You now have a drawing of the C-alpha trace in which the N-terminal residue is coloured sky blue, the C-terminal one orange red, and all others have intermediate colours which gradually change from blue to red as you move along the sequence. Using this colouring scheme makes it much easier to follow the chain when you look at the picture !

If you have time left, play around with some of the paint commands. Afterwards, reset the colours of your molecule such that all carbon atoms are yellow again, etc. Don't forget to backup your database before you stop !

2.4 - It's the Spanish Inquisition !

(1) Explain the difference between the "zone" and the "ca_zone" command.

(2) Which commands do you have to type (including their parameters) in order to generate a single object which contains a C-alpha trace of P2, all atoms of both the first and the last residue, plus all atoms of the residues that lie within 5 Å of the C-alpha atom of residue A83 ?

(3) How many Å fit into one meter ?

(4) Which amino acid types are aromatic ? Which are hydrophobic ? Which are charged ?

(5) How many O commands does the following line contain: "mol x obj y ca ; zo a1 ; zo a5 a6 sph 8 end" ?

(6) What are the parameters of the following O commands: cover_sphere, zone, yasspa, write_formatted, paint_obj_zone, paint_case, paint_property ?

(7) Explain the difference between "Paint_zone" and "Paint_obj_zone".

(8) Which command (including all parameters) would you type to reset the default colours for all atoms of a molecule ?

(9) In P2 myelin protein, which residues are in the two helices ? How many strands are there ? How many residues are in the longest strand ?

(10) One of the atomic properties is the so-called isotropic temperature factor (or B-factor; which datablock is this ?). This number is a measure of the mobility (or disorder) of the atom; the lower the B-factor, the less mobile and the better defined is the atom. Make a zone object of P2 myelin protein in which the atoms are coloured by their B-factors; make atoms with low temperature factors blue and those with high temperature factors red. Which residues contain atoms with the highest B-factors ? Centre on the C-zeta atom of arginine A106 and draw all residues within an 8 Å radius. Is any of these residues red, orange or yellow ? What does this tell you ?

(11) Suppose the following commands have been issued for a certain molecule (not P2):

  pa_zone ;; white
  pa_prop res_type = trp blue
  pa_prop res_type = his magenta
  obj q1 zo ; end
  pa_prop res_name < A100 red
  obj q2 zo ; end
  pa_case atom_z 4 6 7 8 16 yellow blue red green

Determine which colours the following atoms will have (a) in object "q1", (b) in object "q2", and (c) in the database: the C-alpha atom of Trp A91, the N atom of His A173, the S-gamma atom of Cys A12.

Your notes:

3.0 - What's on the menu today ?

In this chapter you will learn how to use and customise the O menu on the graphics window, as well as how to change some of the default settings of the program.

New O commands:

  Clear_flags Connect_File Db_Set_data Menu_control

Your notes:

3.1 - What's a menu ?

Start up O again. If you look at the top right corner, you see the O command "On_off" and beneath it a list of your graphics objects, each preceded by a caret (^). You have already learned that this caret is a special O command which switches an object on and off.
You have also learned that you can execute this command in two different ways:

* by clicking on the text "^OBJECT" on the graphics screen
* or by typing the string "^OBJECT" on the keyboard in the terminal window

This is a general feature of O: every command and parameter can be input in these two different ways (in the next chapter you'll learn about even more ways of doing this).

The list of "words" (O commands, object names etc.) on the right of the graphics window is called "the menu". And, yes, you can put ANY STRING on this menu. For example, to put the "Clear_id" command on the menu, type the following (exactly as shown):

  O > menu clear_i on

You'll see that the text "Clear_ID" has been appended to the menu; it is shown in purple and with a capital "C". The latter means that O has recognised the fact that "clear_i" is a unique abbreviation of the built-in O command "Clear_ID".

Now, click on a few atoms and type "clear_id" on the keyboard. Click some more atoms and bring the cursor over the text "Clear_id" on the menu. Now click on the text "On_off", which is at the top of the menu; watch closely to see what happens to the menu. Click it again; now nothing happens. Can you explain in your own words what the "On_off" command does ?

3.2 - Fast food

There is a quick way to put several O commands at the same time on the menu. O commands are grouped internally in so-called "Major Menus". The command to put them on the menu (and to remove them from it) is the same that you used above, namely "Menu_control". Execute this command, and when O asks you for the name of a major menu, just type a question mark (?):

  O > menu
  As2> ? gives list of menu names
  As2> Major menu? (<cr> = refresh)?
  As2> Available menus:
  As2> Assorted_1 Draw_Mol O_heap_1 Sketch
  As2> Lsq_align Assorted_2 Map Slider
  ...
  As2> Object_4 Object_5

Let's put the major menu called "Draw_Mol" on the menu:

  O > menu
  As2> ? gives list of menu names
  As2> Major menu?
  (<cr> = refresh)draw_mol
  As2> [On]/off:
  As2> Colour? (<cr> = no change): orange
  O > on_off

All the O commands that have been added to the menu should be quite familiar to you by now, except one (which one ?) which will be discussed later in this chapter. Now make a new object called "P2A" which is a C-alpha trace of the entire P2 myelin protein molecule, but use ONLY the menu to issue the commands:

  { click "molecule_na" }
  O > Mol> Molecule code name [P2A ]: { hit the Return key }
  { click "Object_name" }
  O > Mol> Name of the new object [P2A ]: { hit the Return key }
  { click "Ca_zone" }
  O > Mol> Ca zone [all molecule]: { hit the Return key }
  { click "End_object" }

Let's add another useful major menu, "Assorted_1":

  O > menu
  As2> ? gives list of menu names
  As2> Major menu? (<cr> = refresh)assorted_1
  As2> [On]/off:
  As2> Colour? (<cr> = no change): yellow
  O > on

Again you have added several familiar commands to the menu. Now save your database (using the menu) and watch the menu item "Save_DB" closely while you click it ! If you didn't notice anything special, repeat it. Save your database again, and now watch what happens in the terminal window. Save your database for the third time, and now watch the top left corner of the graphics window. Can you explain what happened ?

The first time you saved your database, you saw that the menu item "Save_DB" changed colour for a little while. In general, when you hit an O command on the menu, this item will change colour as long as it is active. The second time, you noticed that as soon as O had executed the command, a new prompt (" O > ") was printed in the terminal window. The third time, you saw that the active command (in fact: ALL active commands) was displayed on the fourth line from the top of the graphics window. This is a useful feature, since you are often engrossed in your work in the graphics window, due to which you easily ignore whatever is written to the terminal window.
In the case of "Save_DB", the command is active until the database has been written to a file. Other commands stay active until you have supplied all necessary input (e.g., the "Ca_zone" command; verify this); after that they execute and when this is done, they are no longer active. There are still other commands (which you will run into later), that stay active until you explicitly switch them off.

Sometimes, you accidentally hit a wrong item on the menu. Depending on which item you hit, different things may happen:

* the command requires no further input and executes immediately. Could you give an example of such a command ?
* the command requires some input, but it can be rendered harmless. For example, if you accidentally hit "Delete_obje", just hit the Return key when O asks you which object should be deleted.
* other commands can often be deactivated by hitting one of the O commands that you just put on the menu, namely "Clear_flags".
* a few commands have their own "reset" command; you will meet some of these commands later.

3.3 - A la carte

On your current menu there are some commands which are easier to enter from the keyboard, for example the "Ca_zone" command. You will find that it is much quicker to type the command plus the commands that usually accompany it (which ?) plus the parameters. It's quicker to type "mol p2a obj t1 ca ; end" than it is to click on "molecule_na", go to the terminal window, hit the Return key, go back to the menu, click "Object_name", etc. etc. Also, you should NOT have the "Stop" command on the menu (it's too easy to hit this by accident and waste time by having to restart O). Moreover, there are some commands that you don't know how to use yet, and there may be commands missing which you would like to add to the menu ("Paint_object", for instance).

There are two ways to change the menu. One you have already encountered, namely using the O command "Menu_control".
For example:

  O > menu wait_id off
  O > menu connect_file off
  O > menu pai_objec on on

However, if you want to make many changes, it is handier to use the following method. It won't come as a surprise that the list of menu items is stored as a datablock in your database:

  O > dir *menu*
  Heap> .MENU_MAJOR_NAME C W 76
  Heap> .MENU_MINOR_NAME C W 758
  Heap> .MENU_COLOUR I W 38
  Heap> .MENU_DISPLAYED I W 38
  Heap> .MENU_VISIBLE I W 38
  Heap> .MENU_INTEGER I W 3
  Heap> .MENU_REAL R W 2
  Heap> .MENU T W 364

The datablock that you are looking for is called ".MENU". Write this datablock to the screen and subsequently also to a file (call this file "mymenu.odb", for example):

  O > wr .menu ;;
  .MENU T 28 12
  Object_name
  molecule_nam
  Ca_zone
  ...
  ^YASSPA
  ^TEST7
  ^P2A
  O > wr .menu mymenu.odb
  Heap> Format:

Now use another terminal window and edit this file. You may change, add and delete items from the list at will. When you edit the file, you will notice that the first line is rather special. It may look like this: ".MENU T 28 12". The first item, ".MENU", is the name of the datablock; do not change this ! The second item, "T", identifies the type of the datablock, in this case Text (which other datablock types do you know ?). The third item is CRUCIAL: this number MUST be the number of items in the datablock. In the case of a text datablock, this is always equal to the number of lines in the file, minus one (why ?). The last number is the length of each item, in this case twelve characters. This means that only the first 12 characters of each menu item are read and stored.
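
The rule for the first line can be checked mechanically. The following Python sketch is not part of O, and the function name check_text_datablock is made up here. It assumes a text datablock file whose first line reads "NAME T COUNT LENGTH" and which has one item on every following line, and it compares the declared number of items with the number of lines that are actually present.

```python
# Check that the item count in a text datablock header matches the file.
# Assumption: the header is "NAME T COUNT LENGTH", one item per line after it.

def check_text_datablock(text):
    """Return (declared, actual, item_length) for a text datablock."""
    lines = text.splitlines()
    name, kind, count, length = lines[0].split()
    if kind != "T":
        raise ValueError("not a text datablock: " + name)
    # Every line after the header counts as one item.
    return int(count), len(lines) - 1, int(length)


example = """.MENU T 3 12
Centre_ID
Clear_ID
a_very_long_menu_item
"""

declared, actual, length = check_text_datablock(example)
print(declared == actual)  # does the header count agree with the file ?

# Only the first `length` characters of each item would be kept.
print([item[:length] for item in example.splitlines()[1:]])
```

If the two numbers differ, the header has to be corrected before the file is read back into O.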

The simplest way to ensure that the number of items (lines) is correct is:

* edit the file and save it
* use the Unix utility "wc" ("word count"): the number of items in the text datablock is the first number that is printed, minus one
* update this number in the file, save it again and quit from the editor

  unix > jot mymenu.odb
  { edit menu and save file }
  unix > wc mymenu.odb
  20 25 293 mymenu.odb
  { change number of items to "19"; save and quit }

Read the file back into O and watch the menu change:

  O > read mymenu.odb

Your menu may now look as follows:

  O > wr .menu ;;
  .MENU T 19 12
  Centre_ID
  Clear_ID
  Clear_flags
  Yes
  No
  Save_DB
  Paint_object
  ca ; end
  On_off
  ^first
  ^test
  ^test2
  ^test3
  ^test4
  ^test5
  ^test6
  ^yasspa
  ^test7
  ^P2A

Note that a menu item may contain spaces and even multiple O commands ("ca ; end") ! Now add the following two lines to your menu: "obj sphere" and "sph ; end". Check that they do what you would expect them to do.

3.4 - The customiser is king

Make a C-alpha trace of P2 myelin protein and make this the only visible object on the screen. Find the N-terminus of the protein and click on the C-alpha atom of the first residue. The label "A1 CA" should appear in red next to the atom. Now click on the C-alpha atoms of residues two, three, four and five. It looks as if you can have only three labels on the screen simultaneously. Can't you have more ? And why do they have to be coloured red ? And why can't you have the residue type, rather than or together with, the residue name as the label ?

The answers to these questions are: "yes, you can", "they don't" and "you can". It shouldn't come as a surprise that these features are controlled through datablocks again. Changing them from their default values to something else is called "customising". In fact, you have already customised something, namely the menu. The advantage of customising is that you can make O look/feel/work in a way that you are comfortable with.
There are several ways in which customisation can take place, but ALL of them involve changes in one or more datablocks in your O database:

* some groups of commands have a special "Setup" command with which you can set values for certain parameters which determine what the commands will do or how they will do it (you shall meet some of these later)
* other commands ask for a parameter the first time you execute them, and the value that you enter is stored and used in the future (the very first O command that you learned works this way)
* many default settings can be changed by editing a datablock
* some settings can be changed by adding a datablock to your database

Consider the "Save_DB" and "Backup_DB" commands: the very first time you executed these, they asked you to supply a file name. Ever since, O has been saving and backing up your database to these files. It's obvious that the names of these files must have been stored in the database (why is this obvious ?).

  O > dir *file*
  Heap> FILE_O_SAVE T W 73
  Heap> FILE_O_BACKUP T W 73
  Heap> FILE_DISPLAY_CONNECTIVITY T W 73
  O > wr FILE_O_SAVE ;;
  FILE_O_SAVE T 1 72
  p2.o
  O > wr FILE_O_BACKUP ;;
  FILE_O_BACKUP T 1 72
  p2_backup.o

Can you change the name of the backup file to something else ? Check that it actually works !

Okay, but how do you know which datablock to change, and how to change it, in order to change a certain parameter ? The only reliable source of such information is the O Manual. Most chapters end with a listing of important datablocks and an explanation of some of the items stored in them. Some datablocks are also described briefly in appendix 10.6 of this tutorial.

Type the following sequence of commands and check what happens:

  O > wr .MOLEC_OBJ_REAL ;;
  O > obj sph sphe 8 end
  O > wr .MOLEC_OBJ_REAL ;;

Apparently, the datablock named ".MOLEC_OBJ_REAL" contains one real number which is the radius used by the "Sphere_centre" command.
This explains how O comes up with the same value you used before as the default when you type only "Sphere_centre": O > obj sph sphere Mol> Residues will be chosen if within a radius of [ 8.00] : O > end Now type the following commands: O > db_set_data .molec_obj_real 1 1 12.5 O > obj sph sphere Mol> Residues will be chosen if within a radius of [ 12.50] : O > end The "DB_set_data" command can be used to set one or more items in an O datablock to a certain value. This command, however, can NOT be used to change datablocks of type Text; these you'll have to write to a file and edit yourself: O > db_set_data Heap> Full name of data block: .menu Heap> Since this is very dangerous, we think Heap> you should write the block out and then Heap> edit it with the regular text editor. Okay; now change the maximum number of labels as follows (and check that it worked; also explain what the parameters of this command are): O > db_s_d .molec_obj_integer 9 9 10 Changing their colour is a bit more involved. O stores colours internally in a rather arcane way, namely as positive integers which are encoded and decoded by various O commands. The colour red, for example, is represented by the number "16711680" (which is not entirely intuitive ...). The simplest way for you to find the number that corresponds to a certain colour is the following: * execute the "Paint_colour" command and enter the NAME of the colour you want to use * O encodes this colour and the resulting number is stored in a datablock called ".ACTIVE_COLOUR"; hence, write the contents of this datablock to the screen * use this number to set the eighth element of the datablock ".MOLEC_OBJ_INTEGER" O > pai_col sky_blue O > wr .active_colour ;; .ACTIVE_COLOUR I 1 (10(x,i7)) 3316172 O > db_s_d .molec_obj_integer 8 8 3316172 Click on some atoms and look what happens to the labels. Right - now you've changed the number of labels and their colour. But how about the actual text of the labels ? 
This, and the text which is shown on the third line from the top of the graphics window whenever you click on an atom, is controlled via a so-called "template". Type the following: O > dir *templ* Heap> .MESSAGE_TEMPLATE T W 369 O > wr .MESSAGE_TEMPLATE ;; .MESSAGE_TEMPLATE T 9 40 %MOLNAM %RESNAM %Restyp %ATMNAM, xyz = atom_xyz ; B = atom_b ; Z = atom_z atom_bone ; residue_2ry_struc The datablock ".MESSAGE_TEMPLATE" controls what is written at the third line of the graphics window. It is called a template since it contains elements whose VALUE will be substituted when an atom is actually clicked: * "%MOLNAM" will be replaced by the name of the molecule * "%RESNAM" by the name of the residue * "%Restyp" by the type of the residue * "%ATMNAM" by the name of the atom you clicked on * ", xyz = " is a text which will be printed literally * "atom_xyz" (on a line by itself !) will be replaced by the X, Y and Z coordinates of the atom * "; B = " is a literal text again, etc. Edit this datablock so that it looks as follows (remember to update the number of lines in the datablock !): O > wr .MESSAGE_TEMPLATE ;; .MESSAGE_TEMPLATE T 5 40 %MOLNAM %RESNAM %RESTYP %ATMNAM ; Coords atom_xyz ; Sec Struc residue_2ry_struc What effect did changing the word "%Restyp" to "%RESTYP" have ? The contents of the labels is also controlled by a datablock, this one called ".ID_TEMPLATE". Write ".MESSAGE_TEMPLATE" to a file, edit this file, change the name of the datablock to ".ID_TEMPLATE" and design a useful label. Read the file in again and check the results. O > wr .id_template ;; .ID_TEMPLATE T 2 40 %Restyp %RESNAM %ATMNAM residue_2ry_struc 3.5 - Connecting the dots differently A little while ago, you came across the "Connect_file" command. Execute this command, and when O asks for a file name, just accept the default. O > conn_fil Mol> Connectivity file? [/nfs/taj/alwyn/o/data/all.dat]: Mol> Maximum inter-residue link distance = 2.00 Mol> There were 23 residues. Mol> 175 atoms. 
The file which you just read in determines which residue types and atoms are recognised by O, and which bonds are to be drawn within each residue type. There is another ready-made file, in the same directory, called "o.dat". Use this connectivity file and draw a zone of the entire P2 molecule: O > con_fil o.dat Mol> Maximum inter-residue link distance = 6.00 Mol> There were 23 residues. Mol> 113 atoms. O > mol p2a obj o zo ; end Now use Unix to copy "o.dat" to your own directory (call it "weirdo.dat"); edit the file such that, for amino acids, only the C-alpha atoms and the atom that is furthest from the C-alpha atom in the side chain are drawn (if two atoms are the same number of bonds removed from the C-alpha atom, select one at random). Use this file to draw another zone of P2 and check that you have changed the file correctly. O > con_fil weirdo.dat O > obj weird zo ; end These connectivity files define explicit bonds for amino-acid residues. If you would read in a DNA molecule now, bonds would be drawn using a distance criterion (which may be wrong, especially if hydrogen atoms are included in the structure). Be sure to read either "all.dat" or "o.dat" in again when you're done with "weirdo.dat" ! 3.6 - What's up, Doc ? (1) Explain the two different ways in which the "menu_control" command can be used. What are the parameters of this command for both cases ? (2) Why is it useful to have the semi-colon character (;) on the menu ? (3) What may happen, or what can you do, when you accidentally hit a wrong command on the menu ? (4) Save your database. Press down the key marked "Ctrl" (the Control key) and keep it down as you hit the "C" key. What happened ? (5) Which four types of O datablocks do you know ? Give an example of each of these from your own database. (6) What are the parameters of the "db_set_data" command ? (7) What are the O colour codes for black, white, yellow, cyan and magenta ? 
(8) What is the difference between using "%RESTYP" and "%restyp" in the message template ?

(9) Explain the difference between a zone drawn with "all.dat" and one drawn with "o.dat". If you can't, look at the entry for glycine in both files and compare these.

Your notes:

4.0 - Again and again and again ...

In this chapter you will learn how to create and use macros (files with sequences of O instructions which you execute often).

New O commands:

Bell_ring If_yes_no Message No Print Spawn Symbols Terminal_ID Wait_ID Yes ! # $ (symbols) $ (Unix) @

Your notes:

4.1 - Recipes

In the previous chapter you learnt a lot about the O menu; now it's time to take a look at some recipes. As you have probably noticed by now, there are many sequences of O commands which you execute time and again. Some of these sequences are always identical, others just vary in one or two parameters (e.g., the name of the molecule or object, or the zone to which a command applies). O contains a mechanism which allows you to type such sequences of commands only once, and to execute them as often as you like. This mechanism is called the "O macro facility". The idea is that you write little O "programs", i.e. series of O commands, in a file, and from then on "execute" this file, rather than typing the commands again.

Use an editor to create a file (in your work directory) called "yasspa.omac" which contains the following commands:

! yasspa.omac
print ... yasspa.omac ...
print First I will run YASSPA on molecule P2A
print Then I will make an object called "yasspa"
print In which the CA atoms are coloured red
print If they are in a helix or blue if they
print Are in a strand
message Please wait a little while ...
mol p2a
yasspa p2a alpha 0.5
yasspa p2a beta 0.8
paint_zone p2a ; white
paint_property res_2ry_struc = "ALPHA" red
paint_property res_2ry_struc = "BETA" blue
obj yasspa ca ; end
bell
on_off
message Done

There are several new O commands inside this file:

* an exclamation mark (!) followed by any text marks a line as being a comment line; it will be ignored by O. Use this in order to document your macros so that you will remember what they do even if you haven't looked at them for half a year.
* "Print" followed by a text will make O type that text to the terminal window * "Message" followed by a text will make O put that text on the second line from the top of the graphics window * "Bell_ring" rings the terminal bell; use this to alert the user (usually, yourself) that input is required, or that a macro has finished its job Now execute the macro by typing an "at" sign (@), directly followed by the complete and exact name of the file (no spaces !): O > @yasspa.omac O > Macro in computer file-system. As4> ... yasspa.omac ... O > As4> First I will run YASSPA on molecule P2A O > As4> Then I will make an object called "yasspa" O > As4> In which the CA atoms are coloured red O > As4> If they are in a helix or blue if they O > As4> Are in a strand O > O > O > Util> Template size : 5 residues. Util> There were 17 O > Util> Template size : 5 residues. Util> There were 75 O > O > O > O > O > O > O > O > Well done - you have now written and executed your first O macro ! There is one little problem with this macro, though: it will only work for a molecule called "P2A". Also, the object will always be called "YASSPA", so you can't run it twice on different molecules (why not ?). Normally, when you type "Molecule_name" followed by the Return key, O asks you which molecule you want to select. However, when O executes a macro, it expects all parameters to commands to be in that file. In other words, if you would use "mol" instead of "mol p2a", O would read from the next line, find the word "yasspa" and assume that this is the name of the molecule you want to use. Of course, you don't have such a molecule in your database and the macro doesn't work as you hoped. Fortunately, there is a way around this. In a macro, you may replace any parameter by a question to the user, enclosed in two hash signs (#). For instance, you could replace the line "mol p2a" by something like: "mol # Which molecule should I use ? #". 
Do this, and also change the line "obj yasspa ca ; end" so that the user can type the name of the object himself or herself and can type either "ca" or "zone". Execute the altered macro to check that it works: O > @yasspa.omac O > Macro in computer file-system. As4> ... yasspa.omac ... O > As4> First I will run YASSPA on molecule P2A ... O > O > O > Which molecule should I use ? p2a O > Util> Template size : 5 residues. Util> There were 17 O > Util> Template size : 5 residues. Util> There were 75 O > O > O > O > O > What should I call the object ? yazoo O > O > Do you want a CA trace or a ZOne (CA/ZO) ? zo O > O > O > O > How many aromatic residues are not in any helix or strand ? Can you write a macro which asks the user to enter the name of a molecule and then paints carbons yellow, oxygens red, nitrogens blue, sulphurs green, hydrogens magenta and all other elements white ? Execute this macro for your P2 molecule and check that it works all right. 4.2 - What's the ID ? Often you will want to be able to select an atom or a zone of residues while executing a macro. For example, if you want to write a macro which draws a sphere of residues around any atom that you click on, using the radius that you used last. Try to write such a macro and execute it. Once you've done this, it's probably time to execute the "Clear_flags" command ... The problem is that O puts "Centre_ID" (which you probably used in your macro) on the list of active commands, ditto for the "Sphere_centre", then gets to and executes the "End_object" command and tells you that there are no bonds to be drawn. To make O wait after the "Centre_ID" until you have actually picked an atom, you have to insert the command "Wait_ID": ! sphere.omac ! 
pick an atom and draw a sphere of residues centre_id wait_id obj sph sphere ; end Sometimes, when you want to look at a number of specific residues, you may want to enter the atom name via the keyboard (in particular if you are working with a molecule you're not very familiar with yet). In that case, use the "Terminal_ID" command. This tells O to expect a molecule name plus residue name plus atom name from the keyboard in the terminal window, rather than from a click in the graphics window: ! sphere_term.omac ! id an atom with the keyboard and draw a sphere of residues centre_id message "Sphere around which atom ?" term_id # Mol, residue, atom ? # sph obj sph sphere ; end Type this macro in a file and execute it. O > @sphere_term.omac O > Macro in computer file-system. O > O > O > Mol, residue, atom ? p2a a131 c 4.3 - What if ? Occasionally, you may want to ask the user a question inside a macro, and, depending on the answer ("Yes" or "No", which, by the way, are also O "commands") execute one macro or another. For this purpose, you may use the command "If_yes_no". It has two parameters: * the name of a macro which is executed when the user replies Yes * the name of a(nother) macro which is executed in the case of a No Both macro names must include the @ sign. 
Use this command to write three general O macros:

* paint_mol.omac - which asks the user to enter the name of a molecule, selects this molecule and asks the user whether (s)he wants to colour this molecule according to secondary structure assignment, or such that aromatic residues are drawn in green and all others in white; after the "If_yes_no" it should reset the normal colours for the most common chemical elements

* paint_aromatic.omac - which colours the molecule such that aromatic residues are drawn in green and all others in white, and draws a zone of the entire molecule in an object with a user-definable name

* paint_yasspa.omac - which colours the molecule according to secondary structure assignment and draws a C-alpha trace in an object with a user-definable name

4.4 - Symbolically speaking

There is yet another way to make O commands, and macros in particular, applicable to general cases. This is done with the "symbol mechanism" in O. The associated command is "Symbols":

O > symbol
As2> Here are the current symbols :
As2> .ID_M P2A
As2> .ID_R A131
As2> .ID_A CA
As2> Symbol name : user
As2> Symbol expansion (<CR>=delete symbol from list) : Gerard
As2> Symbol inserted.

You now have a symbol called "USER"; however, the CONTENTS, or value, of this symbol is the text "Gerard". You can use the value of a symbol, both in typed commands and in macros, by putting a dollar sign ($) immediately in front of it (e.g., "$USER"). Check that this works:

O > print ... My name is $user
As4> ... My name is Gerard
O > symbol

As2> Here are the current symbols :

As2> .ID_M P2A

As2> .ID_R A131

As2> .ID_A CA

As2> USER Gerard

As2> Symbol name :

As you can see, there are three predefined symbols in O: ".ID_M", ".ID_R" and ".ID_A". Their VALUES are the name of the molecule/residue/atom, respectively, which was last identified. Click on any atom to verify that the values of these symbols change accordingly.

Usually, you will define symbols in macros as follows: "symbol my_symbol # Enter the value of ... #".
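The substitution that the symbol mechanism performs can be pictured with a few lines of Python. This is only a toy model and not O's own code: the function name "expand_symbols" is made up for the illustration, the upper-case lookup is inferred from the listing above (a symbol typed as "user" is shown as "USER"), and leaving unknown symbols untouched is an assumption.

```python
import re

def expand_symbols(line, symbols):
    """Replace every $name in `line` by the value stored for NAME.

    Toy model only: names are looked up in upper case and unknown
    symbols are left as they are (an assumption, not O's documented rule).
    """
    def substitute(match):
        name = match.group(1).upper()
        return symbols.get(name, match.group(0))
    # a symbol name is an optional leading dot followed by letters, digits or "_"
    return re.sub(r"\$(\.?\w+)", substitute, line)

# the symbol table as listed by the "Symbols" command in this section
symbols = {".ID_M": "P2A", ".ID_R": "A131", ".ID_A": "CA", "USER": "Gerard"}

print(expand_symbols("print ... My name is $user", symbols))
print(expand_symbols("ce_zo $.id_m $.id_r $.id_r", symbols))
```

The second call merely illustrates that the predefined symbols could be expanded in the same way; whether that particular command line is useful in O is something to try out for yourself.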

4.5 - Unix speaking

There are two ways to "talk to Unix" from within O. If you want to execute just one or two Unix commands from inside O, you may use the "$" command. It's a bit confusing that this command is called "$" (see the previous section), but that's the way life is. To execute a Unix command, type a dollar sign followed by one or more spaces and the entire Unix command (i.e., including all its arguments and parameters). For example, if you want to change the menu:

O > write .menu q ;
O > $ jot q
O > read q

If you want to do more Unix work, you may use the "Spawn" command. This hands control to Unix, until you type "exit" from there. However, with modern, window-based workstations this command is not really necessary anymore, since it's so easy to start a new terminal window to do with Unix whatever you want to do.

4.6 - Do-it-yourself !

Try to write general macros which do the following (if you haven't written them yet):

* cnosh_colours.omac - reset the normal colours for the elements C, O, N, S and H for the current molecule

* acid_base.omac - this should select a molecule; colour acidic residues red, basic ones blue and all others yellow; draw an object "ACIBAS" with a C-alpha trace and reset the colours using cnosh_colours.omac

* change_id.omac - this should ask the user for a colour code and the maximum number of labels which can be displayed simultaneously

* rainbow.omac - this should draw a C-alpha trace of a user-defined molecule, where the colours vary from red at the N-terminus to blue at the C-terminus

* yasspa.omac - this should run YASSPA and draw a C-alpha trace coloured according to secondary structure assignment

Check section 10.4 for some other useful macros. You'll learn a thing or two by studying other people's macros !

There are two more things that are good to know about macros:

* you can store macros in files, but if they are extremely general, you may want to have them in all your O databases. In that case, store the macro in a text datablock. Call the datablock "@your_macro", i.e. include the at sign (@) as the first character of the datablock name.

* this has an additional advantage, namely that you can use short names (which, in turn, makes it possible to put them on the menu !).

To demonstrate these two points, create a file which contains the following (call it "all_on_off.odb"):

! @all_on_off - macro to toggle ALL objects ON/OFF
@all_on_off t 6 30
on_off
message Wait
write .menu q.1 ;
$ grep '\^' q.1 > q.2
@q.2
$ \rm q.1 q.2
message Done
! type: menu @all_on_off on on

Read the file into O, put the macro on the menu and execute it:

O > read all_on_off.odb
Heap> @all_on_off
Heap> type: menu @all_on_off on on
O > menu @all_on_off on on
O > O > Macro in database.
O > O > O > O > Macro in computer file-system.
O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O > O >

Note that the comment lines in the O datablock file are now printed in the terminal window (use this to include instructions to the user on how to install and use the macro). Execute the macro again to restore your old menu.

4.7 - Inter-course fun

Yes, yes, chapters three and four are a trifle boring because they are so technical and have fairly little to do with proteins specifically. Therefore, treat yourself to an "Aha-Erlebnis"; type the following commands literally (don't worry about what the commands do, just sit back and enjoy yourself):

O > ce_zo p2a a1 a131
O > @omac/sketch_setup.omac
O > sketch_auto p2a ; ;
O > on_off
O > spin

The "Sketch_auto" command has created a macro inside your database which actually produces the picture. Can you find out the name of this macro ? Try to figure out what the macro does.

If you can't get enough, play around with some of the function keys ("F-keys", they are in a row at the top of your keyboard); you have already encountered F6 and F7, but you may also be interested in the keys F12, F10, F9 and F8.

4.8 - Tell me why

(1) Explain what a macro is.

(2) Explain how symbols can be defined, listed and used in O.

(3) Discuss the two different uses of the dollar sign in O.

(4) What is the difference between "print" and "message" ?

(5) What is the difference between "wait_id" and "terminal_id" ?

5.0 - What a super position !

In this chapter you will learn how to superimpose the structures of two similar proteins and how to quantify their similarity.
New O commands: Copy_db Db_delete Db_kill Dial_next Dial_previou Lsq_Paired_a Lsq_explicit Lsq_improve Lsq_molecule Lsq_object Move_object O_setup Sam_atom_out Your notes: 5.1 - All lipocalins are equal ... ... But some lipocalins are more equal than others, George Orwell might have said (if he had lived today and had been a protein scientist). This leads us to the subject of structural similarity of proteins. How do you find out if (or: which) proteins are similar ? How do you analyse this similarity, use it on the display, quantify it ? The first question is the most difficult one; one answer might be "through experience", another "by looking at each and every structure that is or has been solved", a third "by using a sufficiently clever program". O cannot be used to detect similarities between different protein structures. Suffice it to say that there is an accompanying program (called "DEJAVU") which can do this. However, use of this program is well beyond the scope of this tutorial. P2 myelin protein, the protein you've been looking at and playing with so far, is a lipocalin, or lipid-binding protein (LBP). There's a whole family of lipocalins; all of them have similar three-dimensional structures and they bind similar ligands. In a separate exercise, you have retrieved a PDB file containing the structure of one member of this family. The example below uses the coordinates of cellular retinol-binding protein, CRBP, to illustrate the use of O. However, since you have another protein to compare to P2, the names and types of the residues that you compare, as well as the numbers that you obtain, will be different from those shown below ! Read your lipocalin molecule into O, give it a sensible name, draw a C-alpha trace (coloured from red to blue going from N- to C-terminus) and a zone object. Check if your PDB file contains one or more than one copy of the protein. Centre on the middle of the (first) molecule. 
Run YASSPA (use your macro from the previous chapter !) and compare the positions of the helices and strands with those in P2 myelin protein (this may give you an idea of how to align the structures !). Use the paint commands ! Does your structure contain a ligand ? If so, generate a separate object with only the ligand in it (use the "Zone" command). Does it contain anything else ? If so, can you guess what it is ? O > s_a_i crbp.pdb crbp Sam> File type is PDB Sam> Database compressed. Sam> Space for 136501 atoms Sam> Space for 10000 residues Sam> Molecule CRBP contained 248 residues and 1236 atoms O > sam_list crbp Sam> Name Type From To Centre Radius Sam> 1 PRO 1 7 5.32 -13.06 -14.35 2.44 Sam> 2 VAL 8 14 4.21 -9.66 -11.64 2.49 ... Sam> 133 VAL 1085 1091 -2.06 4.07 -2.92 2.77 Sam> 134 HIS 1092 1102 -3.00 5.35 -7.96 3.62 Sam> 200 RTL 1103 1123 16.33 4.35 -2.23 7.80 Sam> 201 CD2 1124 1124 2.80 -16.23 -11.66 0.00 Sam> 202 CD2 1125 1125 13.39 -12.80 3.63 0.00 Sam> 203 HOH 1126 1126 14.26 0.68 9.40 0.00 ... Sam> 314 HOH 1235 1235 20.42 10.07 -5.77 0.00 Sam> 315 HOH 1236 1236 35.08 1.99 -6.12 0.00 O > mol crbp O > pai_ramp Paint> Colour-ramp a property in molecule CRBP Paint> Property [residue_irc] : Paint> Minimum and maximum value of property [1 248] : 1 134 Paint> First colour [blue] : red Paint> Second colour [red] : blue O > obj crbpt ca ; end O > obj crbpz zone ; end O > O > ce_zo crbp 1 134 As4> CRBP 1 134 CRBPZ As4> Centering on zone from 1 to 134 O > @yasspa.omac O > Macro in computer file-system. O > Which molecule ? crbp O > O > Util> Template size : 5 residues. Util> There were 17 O > Util> Template size : 5 residues. Util> There were 75 O > mol crbp obj ligand O > zone 200 ; end 5.2 - Match-makers The first thing you have to do is to get an initial idea of which residues in your molecule correspond to which residues in the structure (!) of P2 myelin protein. You could for example compare where the helices are. 
In P2 the first helix (according to YASSPA) runs from Phe A16 to Leu A23, and the second from Leu A27 to Leu A35. In CRBP these helices run from Phe 16 to Leu 23 and from Val 27 to Leu 35, respectively. Apparently, P2 and CRBP are VERY similar.

Before you try to get an initial alignment, it's good practice to use a new O command to align two similar protein structures roughly by hand: "Move_object". When you type this command, O expects you to click on a molecular object. After that, you can use the dials (or the pseudo-dials if you hit the F6 key) to rotate and translate the object. Execute this command and type the name of your lipocalin object (i.e., NOT that of P2 myelin protein). (Hint: if your PDB file contained more than one protein molecule, make a new object of only one of them and use that object here.) Use the dials to position your lipocalin on top of P2 as well as you can.

O > move_object
Mnp> What object is to be moved around ? yasspa
Mnp> Fragment pivot point: 0.000 0.000 0.000

Wait a minute: you can only translate the object (unless you use the pseudo-dials) ! Time for another O command: "Dial_previous". You'll probably want to put this command and its little sister, "Dial_next", on your menu:

O > menu dial_next on on
O > menu dial_prev on on
O > dial_prev

Check the bottom-left corner of the graphics window as you click these two commands in turn: they toggle the assignment of three of the physical dials between rotation and translation of your object. Another thing you may want to do as soon as your molecular object is in the neighbourhood of P2 is to centre on the middle of P2:

O > ce_zo p2a a1 a131
As4> P2A A1 A131 YASSPA
As4> Centering on zone from A1 to A131

Continue rotating and translating your molecule until you're happy with the fit; then click or type "Yes" and answer the question that follows:

O > yes
Mnp> The trnasformation s saved in ([obj_rt]): crbp_to_p2

No, this has nothing to do with the "Formation of tRNAs" ...
It merely shows that some programmers write better Fortran than English. In fact, O asks you for the name of a datablock in which the transformation that you applied by hand can be stored. Have a look at this datablock:

O > wr CRBP_TO_P2 ; (3f15.8)
CRBP_TO_P2 R 12 (3f15.8)
     0.71129280     0.49543828     0.49860156
    -0.64166176     0.74725312     0.17286676
    -0.28693673    -0.44289240     0.84941959
    38.35695267    58.46720123    27.90679932

It contains twelve numbers; the first nine constitute a unitary rotation matrix, the last three a translation vector (in Å); the twelve numbers together define a so-called (RT) operator. If the numbers in the operator are denoted R1 to R12 as follows:

R1   R2   R3
R4   R5   R6
R7   R8   R9
R10  R11  R12

then the relationships between the coordinates of a point (X',Y',Z') after application of the transformation and the point (X,Y,Z) from which it originates are:

X' = R1 * X + R4 * Y + R7 * Z + R10
Y' = R2 * X + R5 * Y + R8 * Z + R11
Z' = R3 * X + R6 * Y + R9 * Z + R12

Now find a pair of C-alpha atoms, one in P2 and one in your lipocalin, which are very close together now that you have applied the transformation by hand, for example two residues in corresponding helices. Find out what coordinates these two atoms have. For the atom in P2 you can simply click on it and read the coordinates from the information line in the graphics window. Unfortunately, you cannot click on atoms in your transformed object (since O still thinks that the molecule is at its old position, which it actually IS since you haven't changed the coordinates of the molecule in the database ! All you've done is to move an OBJECT around !). However, you could either make a new object of your molecule, centre on it and find the corresponding residue, or use the "Terminal_id" command:

O > term_id p2a a20
As1> Object name? [YASSPA]: p2a
{ read coordinates: 51.80 68.08 44.45 }
O > term_id crbp 20 ca ;
{ read coordinates: 23.12 1.87 4.99 }

Apply the transformation (which you saved after moving your object) to the coordinates of the C-alpha atom in your lipocalin to find the coordinates of the transformed atom (you may want to use the Unix calculator "xcalc" if you don't have a pocket calculator at hand):

X' = 0.711 * 23.12 + -0.642 * 1.87 + -0.287 * 4.99 + 38.357 = 52.16
Y' = 0.495 * 23.12 +  0.747 * 1.87 + -0.443 * 4.99 + 58.467 = 69.10
Z' = 0.499 * 23.12 +  0.173 * 1.87 +  0.849 * 4.99 + 27.907 = 44.00

In other words: the C-alpha of residue A20 in P2 myelin protein is at (51.80,68.08,44.45) and the C-alpha of the corresponding residue 20 in CRBP ends up at (52.16,69.10,44.00) once the operator has been applied. What is the distance between these two matched C-alpha atoms after transformation of CRBP ?

Distance = SQRT ( (X1-X2)^2 + (Y1-Y2)^2 + (Z1-Z2)^2 )
         = SQRT ( 0.36^2 + 1.02^2 + 0.45^2 )
         = SQRT ( 1.3725 ) ~ 1.17 Å

How well did you do ? Did your C-alpha atoms lie within a distance of 1 Å, 1.5 Å, 2 Å or more ?

One way to quantify the similarity between two protein structures is:

* superimpose the two molecules "as best as you can"

* count how many C-alpha atoms have distances less than x Å

* calculate the RMSD of these C-alpha atoms

The "RMSD" is the "root-mean-square distance"; it is calculated by summing the squares of the distances of the matching atoms, dividing by the number of atoms, and taking the square root of this number. For instance, if you have three matching atoms (for simplicity !) with distances of 1.17, 1.35 and 0.98 Å respectively, then the RMSD is calculated as follows:

RMSD = SQRT ( (1.17^2 + 1.35^2 + 0.98^2) / 3 )
     = SQRT ( 4.1518 / 3 ) ~ 1.18 Å

The RMSD is thus a kind of average distance between matched atoms, except that (due to the squares) larger distances are weighted somewhat more heavily.

5.3 - Operator, what's the number ?

It's time to stow the calculators and to let O do some of the hard work.
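If you would rather check the hand calculations of section 5.2 with a script than with a pocket calculator, the following Python sketch does the same arithmetic. It is only an illustration: the function names "apply_rt", "distance" and "rmsd" are made up here and are not O commands, and the numbers are those of the CRBP example (yours will be different). The operator is used with the full precision of the datablock, so the results may differ in the second decimal from the hand calculation, which rounded the operator to three decimals.

```python
import math

# The twelve numbers R1..R12 of the example datablock CRBP_TO_P2,
# in the order in which O writes them out.
RT = [0.71129280, 0.49543828, 0.49860156,
      -0.64166176, 0.74725312, 0.17286676,
      -0.28693673, -0.44289240, 0.84941959,
      38.35695267, 58.46720123, 27.90679932]

def apply_rt(rt, point):
    """Apply X' = R1*X + R4*Y + R7*Z + R10 (and likewise for Y', Z')."""
    x, y, z = point
    return (rt[0] * x + rt[3] * y + rt[6] * z + rt[9],
            rt[1] * x + rt[4] * y + rt[7] * z + rt[10],
            rt[2] * x + rt[5] * y + rt[8] * z + rt[11])

def distance(p, q):
    """Distance between two points, in the same unit as the coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def rmsd(distances):
    """Root-mean-square of a list of distances between matched atoms."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

p2_ca = (51.80, 68.08, 44.45)    # C-alpha of A20 in P2
crbp_ca = (23.12, 1.87, 4.99)    # C-alpha of 20 in CRBP, before transformation

moved = apply_rt(RT, crbp_ca)
print("transformed CRBP atom: %.2f %.2f %.2f" % moved)
print("distance to P2 atom  : %.2f" % distance(p2_ca, moved))
print("RMSD of three atoms  : %.2f" % rmsd([1.17, 1.35, 0.98]))
```

Replace the operator and the two sets of coordinates by your own values to check your own manual superposition.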
Aligning two structures usually entails three steps: * finding an initial alignment (output from a program, manual rotation and translation, alignment of certain structural elements) * getting an initial operator from this alignment * optimising the alignment You have already carried out the first step. Now there are two possible ways to proceed. The first is to identify stretches of corresponding residues in P2 and your lipocalin and feeding them to the O command "Lsq_explicit": O > lsq_explicit Lsq > Least squares match by explicit definition of atoms. Lsq > Given 2 molecules A, B the transformation rotates B onto A Lsq > What is the name of A (the not rotated molecule)? p2a Lsq > What is the name of B (the rotated molecule)? crbp Lsq > Now define what atoms in A [=P2A] are to be matched to B [=CRBP] Lsq > Defining 3 names in P2A implies a zone and an atom name. Lsq > Defining 2 names in P2A implies a zone and CA atoms. Lsq > Defining 1 name in P2A implies the CA of that residue. Lsq > Molecule CRBP just requires the start residue and atom name. Lsq > A blank line terminates input. Lsq > Define atoms from P2A (the not rotated molecule): a16 a23 ca Lsq > Define atoms from CRBP (the rotated molecule): 16 Lsq > Define atoms from P2A (the not rotated molecule): a27 a35 Lsq > Define atoms from CRBP (the rotated molecule): 27 Lsq > Define atoms from P2A (the not rotated molecule): Lsq > The 17 atoms have an r.m.s. fit of 0.618 Lsq > xyz(1) = 0.7280*x+ -0.6346*y+ -0.2595*z+ 37.6558 Lsq > xyz(2) = 0.6080*x+ 0.7725*y+ -0.1835*z+ 53.4470 Lsq > xyz(3) = 0.3169*x+ -0.0242*y+ 0.9481*z+ 32.3691 Lsq > The transformation can be stored in O. Lsq > A blank is taken to mean do not store anything Lsq > The transformation will be stored in .LSQ_RT_crbp_to_p2a Make sure that you identify the molecules correctly, i.e. P2 is the "not rotated molecule" ! Also, be careful in specifying the zones which are to be aligned (check the residue names !). 
In this case, O was told to align the 17 alpha-helical residues on their C-alpha atoms. This gave an RMSD of 0.62 Å (twice as good as the example match calculated earlier). Compare the operator printed by O to the one you obtained by manual rotation and translation. They should be fairly similar. Note that the operator is stored in a datablock with a partially fixed name: ".LSQ_RT_" plus any name that you want to give it (use meaningful names !). Make a new C-alpha trace of your lipocalin (call it "LSQO") and apply the new operator to it with the O command "Lsq_object": O > lsq_obj Lsq > Apply a transformation to an existing object. Lsq > There are these transformations in the database Lsq > CRBP_TO_P2A Lsq > Which alignment [<CR>=restore a transformed object] ? crbp_to_p2a Lsq > There is an object called FIRST ... Lsq > There is an object called LSQO Lsq > Which object ? lsqo The alignment you have now is based solely on the alignment of the two helices. Fortunately, O contains a command which will try to improve the alignment by looking for long stretches of matching residues in both proteins. This command is called "Lsq_improve": O > lsq_imp Lsq > Least squares match by Semi Automatic Alignment. Lsq > There are these transformations in the database Lsq > CRBP_TO_P2A Lsq > Which alignment ? crbp_to_p2a Lsq > Given 2 molecules A,B the transformation rotates B onto A Lsq > What is the name of molecule A [P2A ]? Lsq > Zone to look for alignment [all molecule A] : Lsq > What is the name of molecule B [CRBP ]? Lsq > Zone to look for alignment [all molecule B] : Lsq > What atom [CA] ? Lsq > Number of atoms in A/B to look for alignment 131 134 Note that you don't have to type the entire name of the operator since O expects operators to begin with ".LSQ_RT_". Once you give the name of the operator, O "remembers" which two molecules were compared. The defaults are to look for matching residues in the two complete molecules and to use the C-alpha atoms. 
After this, O starts looking for long connected fragments; when it's finished, it will calculate a new operator by fitting only these residues. The new operator, the number of matched residues (C-alpha atoms) and their RMS distance are printed:

Lsq > A fragment of 41 residues located.
Lsq > A fragment of 30 residues located.
Lsq > A fragment of 15 residues located.
Lsq > A fragment of 6 residues located.
Lsq > A fragment of 4 residues located.
Lsq > Loop = 1 ,r.m.s. fit = 1.393 with 96 atoms
Lsq > x(1) = 0.7102*x+ -0.6767*y+ -0.1943*z+ 37.8776
Lsq > x(2) = 0.6238*x+ 0.7328*y+ -0.2719*z+ 53.9637
Lsq > x(3) = 0.3264*x+ 0.0719*y+ 0.9425*z+ 31.7407

With this new operator, O repeats the fragment search process. This continues until no further improvement can be achieved:

Lsq > 0Search for connected fragments.
Lsq > A fragment of 42 residues located.
Lsq > A fragment of 30 residues located.
Lsq > A fragment of 27 residues located.
Lsq > A fragment of 24 residues located.
Lsq > Loop = 2 ,r.m.s. fit = 1.207 with 123 atoms
Lsq > x(1) = 0.6830*x+ -0.7115*y+ -0.1655*z+ 38.0744
Lsq > x(2) = 0.6473*x+ 0.6945*y+ -0.3141*z+ 53.9228
Lsq > x(3) = 0.3385*x+ 0.1074*y+ 0.9348*z+ 31.6207
...
Lsq > 0Search for connected fragments.
Lsq > A fragment of 42 residues located.
Lsq > A fragment of 32 residues located.
Lsq > A fragment of 27 residues located.
Lsq > A fragment of 24 residues located.
Lsq > Loop = 4 ,r.m.s. fit = 1.270 with 125 atoms
Lsq > x(1) = 0.6795*x+ -0.7158*y+ -0.1610*z+ 38.1367
Lsq > x(2) = 0.6527*x+ 0.6900*y+ -0.3130*z+ 53.8917
Lsq > x(3) = 0.3351*x+ 0.1076*y+ 0.9360*z+ 31.6551

After the final cycle, you must supply the name of the datablock in which the new operator should be stored (this may be the same name as that of the original operator datablock):

Lsq > The transformation can be stored in O.
Lsq > A blank is taken to mean do not store anything
Lsq > The transformation will be stored in .LSQ_RT_crbp_to_p2a

Finally, O prints a list of the residues that were matched in the two proteins. In this case, four continuous stretches (zones or fragments) of matching residues were located:

Lsq > Here are the fragments used in the alignment
Lsq > 0 A3 KFLGTWKLVSSENFDEYMKALGVGLATRKLGNLAKPRVIISK A44
Lsq > 3 DFNGYWKMLSNENFEEYLRALDVNVALRKIANLLKPDKEIVQ 44
Lsq > 0 A48 IITIRTESPFKNTEISFKLGQEFEETT A74
Lsq > 48 HMIIRTLSTFRNYIMDFQVGKEFEEDL 74
Lsq > 0 A75 ADNRKTKSTVTLARGSLNQVQKWN A98
Lsq > 77 IDDRKCMTTVSWDGDKLQCVQKGE 100
Lsq > 0 A100 NETTIKRKLVDGKMVVECKMKDVVCTRIYEKV A131
Lsq > 102 EGRGWTQWIEGDELHLEMRAEGVTCKQVFKKV 133

Make an object in which the matched residues in P2 are coloured blue and the others white. How many of the matched residues are identical in P2 and your own protein ? How much is that in percent ? There is a conserved bit of sequence in lipocalins involving GxW. Where is this sequence in P2 and in your own protein ? Were these residues matched by O ?

Make yet another object of your lipocalin and apply the latest operator to it. Does it look better than the initial one ? Which zones in P2 and your lipocalin do not match very well ? Is this perhaps due to insertions/deletions (specify !) ?

O > mol crbp obj imp ca ; end
O > lsq_obj crbp_to_p2a imp
O > pai_colour cyan pai_object imp
Paint> IMP

5.4 - Advanced match

List the datablocks in your database whose names begin with ".LSQ":

O > dir .lsq*
Heap> .LSQ_INTEGER I W 3
Heap> .LSQ_RT_CRBP_TO_P2A R W 12
Heap> .LSQ_MN_CRBP_TO_P2A T W 4625
Heap> .LSQ_MM_CRBP_TO_P2A T W 13
Heap> .LSQ_VP_CRBP_TO_P2A T W 7874

* .LSQ_RT_CRBP_TO_P2A - contains the improved operator
* .LSQ_MM_CRBP_TO_P2A - contains the names of the two molecules
* .LSQ_MN_CRBP_TO_P2A - print this datablock to see what it is
* .LSQ_VP_CRBP_TO_P2A - this datablock contains instructions for O to draw lines between matched C-alpha atoms in the two molecules
* .LSQ_INTEGER - contains some parameters for the Lsq commands

First, draw the object with the lines between matched atoms. This is done with the "Lsq_paired_atoms" command:

O > lsq_pair
Lsq > There is an matched pair called CRBP_TO_P2A
Lsq > Object state ([ON],OFF) : on
O > on_off

Of course, the longer a line, the larger the distance between the two corresponding atoms (and the worse the fit).

The ".LSQ_VP_" and ".LSQ_MN_" datablocks take up quite a bit of space in your database. If you want to get rid of them, use the "DB_delete" or the "DB_kill" command:

O > db_del .LSQ_MN_CRBP_TO_P2A
Heap> Delete .LSQ_MN_CRBP_TO_P2A? [No]: yes
O > db_kill .LSQ_VP_CRBP_TO_P2A
Heap> Deleted .LSQ_VP_CRBP_TO_P2A

You may use the asterisk (*) as a wildcard, but this is dangerous with "DB_kill" (why?) ! It's always a good idea to do "Directory" on the datablocks you want to kill with a wildcard before you actually kill them. This will show the names of all datablocks that are going to be deleted with the "DB_kill" command; this may include special datablocks you didn't know anything about (and which should therefore NOT be deleted).

The ".LSQ_INTEGER" datablock contains three integer numbers which are parameters for some of the Lsq commands:

* the first number is ten times the distance cut-off for "Lsq_improve": if two atoms have a distance less than or equal to this number divided by 10, then they are considered to be "matchable"
* the second number is the minimum number of residues that must be in a consecutive fragment in "Lsq_improve"
* the third number is the integer colour code used for drawing the lines between matched atoms in "Lsq_paired_atoms"

The default values are as follows:

O > wr .lsq_integer ;;
.LSQ_INTEGER I 3 (8(x,i8))
38 3 16711680

Using only O commands (i.e., no editor), change the distance cut-off to 1.0 Å and the minimum fragment length to 5 residues and repeat the "Lsq_improve" step. If no atoms match, increase the cut-off to 1.5 Å (or more if necessary) and repeat.

...
Lsq > 0Search for connected fragments.
Lsq > A fragment of 23 residues located.
Lsq > A fragment of 9 residues located.
Lsq > A fragment of 8 residues located.
Lsq > A fragment of 7 residues located.
Lsq > Loop = 9 ,r.m.s. fit = 0.475 with 47 atoms
Lsq > x(1) = 0.7116*x+ -0.6685*y+ -0.2160*z+ 38.1557
Lsq > x(2) = 0.6047*x+ 0.7394*y+ -0.2961*z+ 54.1521
Lsq > x(3) = 0.3577*x+ 0.0801*y+ 0.9304*z+ 31.5063
Lsq > The transformation can be stored in O.
Lsq > A blank is taken to mean do not store anything
Lsq > The transformation will be stored in .LSQ_RT_CRBP_TO_P2A
Lsq > Here are the fragments used in the alignment
Lsq > 0 A4 FLGTWKLVSSENFDEYMKALGVG A26
Lsq > 4 FNGYWKMLSNENFEEYLRALDVN 26
Lsq > 0 A36 AKPRVII A42
Lsq > 36 LKPDKEI 42
Lsq > 0 A112 KMVVECKM A119
Lsq > 114 ELHLEMRA 121
Lsq > 0 A122 VVCTRIYEK A130
Lsq > 124 VTCKQVFKK 132

Apparently, there is a core of 47 residues which match within 1 Å, with an RMSD of 0.48 Å. Now increase the cut-off to 6 Å and repeat the operation.

...
Lsq > 0Search for connected fragments.
Lsq > A fragment of 85 residues located.
Lsq > A fragment of 46 residues located.
Lsq > Loop = 3 ,r.m.s. fit = 3.126 with 131 atoms
Lsq > x(1) = 0.7173*x+ -0.6745*y+ -0.1749*z+ 37.3982
Lsq > x(2) = 0.6102*x+ 0.7292*y+ -0.3098*z+ 54.4741
Lsq > x(3) = 0.3365*x+ 0.1156*y+ 0.9346*z+ 31.3459
Lsq > The transformation can be stored in O.
Lsq > A blank is taken to mean do not store anything
Lsq > The transformation will be stored in .LSQ_RT_rough
Lsq > Here are the fragments used in the alignment
Lsq > 0 A1 SNKFLGTWKLVSSENFDEYMKALGVGLATRKLGNLAKPRVIISKKG A46
Lsq > 1 PVDFNGYWKMLSNENFEEYLRALDVNVALRKIANLLKPDKEIVQDG 46
Lsq > 0 A47 DIITIRTESPFKNTEISFKLGQEFEETTADNRKTKSTVTLARGSLNQVQK
Lsq > 48 HMIIRTLSTFRNYIMDFQVGKEFEEDLTGIDDRKCMTTVSWDGDKLQCVQ
Lsq > 0 A97 WNGNETTIKRKLVDGKMVVECKMKDVVCTRIYEKV A131
Lsq > 98 KGEKEGRGWTQWIEGDELHLEMRAEGVTCKQVFKK 132

Note that all residues in P2 myelin protein have been matched now, but the alignment is clearly out of register (compare with the previous alignment). When the initial alignment is very crude, it may be a good idea to start with rather lax criteria (large cut-off, small fragment size) to improve the crude alignment. Subsequently, make the criteria stricter and improve the operator again.

5.5 - How did you do ?

Now go back to your old, manually derived operator. First, copy it to another datablock (the Lsq commands expect operator datablock names to begin with ".LSQ_RT_", remember?). Use the "Copy_DB" command to copy one datablock to another. Note that the first parameter is the name of the NEW datablock and the second that of the OLD datablock ! Second, reset the Lsq parameters in the ".LSQ_INTEGER" datablock to their original values. Third, improve your own operator and compare the result to the one obtained when you used "Lsq_improve" for the first time.

O > copy_db .lsq_rt_manual CRBP_TO_P2
O > db_set_dat .lsq_integer 1 1 38
O > db_set_dat .lsq_integer 2 2 3
O > lsq_imp
Lsq > Least squares match by Semi Automatic Alignment.
Lsq > There are these transformations in the database
Lsq > CRBP_TO_P2A ROUGH MANUAL
Lsq > Which alignment ? manual
Lsq > Given 2 molecules A,B the transformation rotates B onto A
Lsq > What is the name of molecule A [MANUAL]? p2a
Lsq > Zone to look for alignment [all molecule A] :
Lsq > What is the name of molecule B [ ]? crbp
Lsq > Zone to look for alignment [all molecule B] :
Lsq > What atom [CA] ?
Lsq > Number of atoms in A/B to look for alignment 131 134
...
Lsq > Loop = 5 ,r.m.s. fit = 1.270 with 125 atoms
Lsq > x(1) = 0.6795*x+ -0.7158*y+ -0.1610*z+ 38.1367
Lsq > x(2) = 0.6527*x+ 0.6900*y+ -0.3130*z+ 53.8917
Lsq > x(3) = 0.3351*x+ 0.1076*y+ 0.9360*z+ 31.6551
Lsq > The transformation can be stored in O.
Lsq > A blank is taken to mean do not store anything
Lsq > The transformation will be stored in .LSQ_RT_manual
Lsq > Here are the fragments used in the alignment
Lsq > 0 A3 KFLGTWKLVSSENFDEYMKALGVGLATRKLGNLAKPRVIISK A44
Lsq > 3 DFNGYWKMLSNENFEEYLRALDVNVALRKIANLLKPDKEIVQ 44
Lsq > 0 A48 IITIRTESPFKNTEISFKLGQEFEETT A74
Lsq > 48 HMIIRTLSTFRNYIMDFQVGKEFEEDL 74
Lsq > 0 A75 ADNRKTKSTVTLARGSLNQVQKWN A98
Lsq > 77 IDDRKCMTTVSWDGDKLQCVQKGE 100
Lsq > 0 A100 NETTIKRKLVDGKMVVECKMKDVVCTRIYEKV A131
Lsq > 102 EGRGWTQWIEGDELHLEMRAEGVTCKQVFKKV 133

If the results are identical you may congratulate yourself ! You are now an accomplished protein aligner, both using your eyes and hands and using the tools in O !

There is one more Lsq command of interest: "Lsq_molecule". This applies an operator to a molecule's coordinates. Explain the difference between "Lsq_object" and "Lsq_molecule" (remember the difference between painting an object and painting a molecule!). Be aware of the fact that "Lsq_molecule" really changes the coordinates ! Therefore, read in the structure of your lipocalin again, but give it a different name. Apply the best operator to it and draw a C-alpha trace and a zone.
Save these coordinates in a new PDB file (use the "Sam_atom_out" command and answer the questions that O asks you).

O > s_a_i crbp.pdb crp2
...
Sam> Molecule CRP2 contained 248 residues and 1236 atoms
O > lsq_mol
Lsq > Apply a transformation to an existing molecule.
Lsq > There are these transformations in the database
Lsq > CRBP_TO_P2A ROUGH MANUAL
Lsq > Which alignment [<CR>=abort operation] ? manual
Lsq > There these molecules in the database
Lsq > ALPHA BETA DI P2A CRBP CRP2
Lsq > Which molecule? [<CR>=abort operation]: crp2
Lsq > Define a zone in which to apply the tranformation [all]:
O > sam_atom_out
Sam> Output file name: crp2.pdb
Sam> Coordinate file type assumed from file name is PDB
Sam> What molecule [CRBP ]: crp2
Sam> Residue range [all molecule]:
Sam> Define cell constants [ 1.00 1.00 1.00 90.00 90.00 90.00]:
Sam> Write out only selected atoms? [No]:
Sam> Use the B-factor? [Yes]:
Sam> Use the occupancy? [Yes]:
Sam> 1236 atoms written out.
O > mol crp2 obj crca ca ; end
O > obj crzo zo ; end

Finally, a useful hint: when you use the Lsq tools in O, you probably would like to have a copy of the output for later use (e.g., to stick into your laboratory notebook). Use the "O_setup" command to toggle the use of a log file (and more) on and off:

O > o_setup
As2> ECHO State (on/[off]) : off
As2> LOG_COMMANDS State (on/[off]) : on
As2> LOG_LISTING State (on/[off]) : on

If the echo is on, all commands in macros will be echoed to the terminal window. If command logging is on, all commands are written to a file called "o_log.fmt"; if output logging is on, all O output will also be written to a file called "o_log.lst". When you are writing macros, this command is also useful: just switch on the command logging, type all the commands that you want to put into the macro eventually, copy the log file to a new macro file, and edit it.
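
What applying an operator to a molecule and writing it out as a PDB file amounts to can also be sketched outside O. The Python fragment below is only an illustration under stated assumptions, not O's own code: the function name "transform_pdb_line" and the sample record are made up, and it relies solely on the fixed-column PDB convention that x, y and z occupy columns 31-38, 39-46 and 47-54 of ATOM and HETATM records. The operator values are those printed by "Lsq_improve" for the MANUAL operator in this section.

```python
# Operator printed by "Lsq_improve" in this section
# (rotation rows, then translation in Å).
ROTATION = [
    [0.6795, -0.7158, -0.1610],
    [0.6527,  0.6900, -0.3130],
    [0.3351,  0.1076,  0.9360],
]
TRANSLATION = [38.1367, 53.8917, 31.6551]


def transform_pdb_line(line, rotation=ROTATION, translation=TRANSLATION):
    """Apply the operator to one PDB record; other records pass through unchanged."""
    if not line.startswith(("ATOM", "HETATM")):
        return line
    # Fixed-column coordinates (0-based slices of columns 31-38, 39-46, 47-54).
    old = [float(line[30:38]), float(line[38:46]), float(line[46:54])]
    new = [
        sum(r * c for r, c in zip(row, old)) + shift
        for row, shift in zip(rotation, translation)
    ]
    return line[:30] + "%8.3f%8.3f%8.3f" % tuple(new) + line[54:]


# Hypothetical record, built with a format string so that the columns line up.
record = "ATOM  %5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f" % (
    1, " CA", "", "ALA", "A", 1, "", 0.0, 0.0, 0.0, 1.0, 20.0)
print(transform_pdb_line(record))
```

Only the three coordinate fields change; atom names, residue numbers, occupancies and B-factors are copied as they are.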

5.6 - Always asking questions

(1) What are the parameters for the following commands: lsq_explicit, copy_db, sam_atom_out, db_kill, lsq_molecule ?
(2) Explain why "paint_zone" / "paint_object_zone" can be compared to "lsq_molecule" / "lsq_object".
(3) What is a unitary rotation matrix ?
(4) Define the RMS distance of two aligned protein structures.
(5) If you have eleven matched atom pairs, ten of which have a distance of 1.0 Å and one of which is 6 Å away from its "partner", what is the RMSD ? And the average distance ?
(6) What percentage of matched residues in P2 and your lipocalin (using the default parameters for "lsq_improve") is identical in both proteins ?
(7) What does the ".LSQ_MN_*" datablock contain ?
(8) What is the difference between "db_delete" and "db_kill" ? Which one is potentially dangerous ? Why ?
(9) Which O command should you execute in order to change the distance cut-off for "lsq_improve" to 3.5 Å ?
(10) What is the number of matched atoms, their RMSD and their percentage sequence identity between P2 and your lipocalin if you use distance cut-offs of 1.0, 3.8 and 6.0 Å ?
(11) Pool your superposition results with those obtained by the other students. Try to rank the various lipocalins according to their similarity to P2 myelin protein.

6.0 - Superficial voids

In this chapter you will learn how to handle "maps" (functions of three variables calculated on a grid) by generating and looking at different representations of a protein's surface.

New O commands:

CPK_object Map_Active_C Map_Draw Map_File Map_Object Map_Paramete Rot_Tran_Obj

Your notes:

6.1 - Speleology

Lipocalins bind and transport lipids (fatty compounds). Read in the structure of the ligand that is bound inside P2 myelin protein (some fatty acid) from a file called "faa.pdb" (this file should be in your work directory); draw a zone object of the ligand and centre on the middle of this molecule. Draw all residues in P2 myelin protein which are within 10 Å of this point.
Which of these do you think might interact with the hydrophilic head of the ligand, and which ones with its hydrophobic tail ?

O > s_a_i faa.pdb faa
O > mol faa zo ; end
O > sa_li faa
Sam> Name Type From To Centre Radius
Sam> A200 FA 1 20 46.39 67.66 37.36 6.53
O > ce_zo faa a200 ;
As4> No object defined.
As4> FAA A200 A200 FAA
As4> Centering on zone from A200 to A200
O > mol p2a sphere 10 end

If your lipocalin contains a ligand, investigate that ligand in the same manner.

Now draw a sphere of residues in your lipocalin, after it has been put on top of P2 myelin protein (using the best operator obtained in the previous chapter). Are there any side chains which are more or less conserved in the two structures (i.e., they occupy a similar position in space and are of a similar nature, e.g. both aromatic or both acidic) ? If your ligand is amphiphilic, is it bound in the same way as the fatty acid in P2 (i.e., are the hydrophilic heads oriented in a similar way) ? If your lipocalin does not contain a ligand, do you think that it could bind the fatty-acid ligand of P2 ? Would it be in the same position, or can you find a better way (with less steric hindrance and/or better ligand-protein contacts) ?

If you see any "red crosses" inside the cavity of your lipocalin, what do you think these represent ? Check your answer. Is there anything else inside the cavity (salt ions, metals, solvent) ?

What sort of commands would you need to have in O in order to answer all these questions more easily ? (Most of the commands you can think of will probably be among the ones to be introduced in the next chapter.)

6.2 - Surfaces

There are two ready-to-use O macros for calculating and displaying protein surfaces:

* omac/make_vdwsurf.omac - calculates and displays a van der Waals surface
* omac/make_surface.omac - calculates and displays a solvent-accessible surface, using a probe radius of 1.4 Å

Discuss the difference(s) between these two types of surface.
Execute the macros one at a time for P2 (they may take a little while to complete):

O > @omac/make_vdwsurf.omac
O > Macro in computer file-system.
O > Which molecule ? p2a
...
>>> CONVERGENCE <<<
Last change (A3/%) : ( 2.483E+00 1.786E-02)
Nr of volume calculations : ( 3)
Average volume (A3) : ( 1.392E+04)
Volume corresponds to a sphere of radius (A) : ( 1.492E+01)
Standard deviation (A3) : ( 2.000E+01)
...
As4> ... Surface map file is surface_vdwmol.map

You should now have a red, dotted object called "VDW" on the graphics display. Zoom out a little and make the slab a bit smaller. Check if the fatty acid touches the van der Waals surface.

O > @omac/make_surface.omac
O > Macro in computer file-system.
O > Which molecule ? p2a
...
>>> CONVERGENCE <<<
Last change (A3/%) : ( 2.378E+01 8.554E-02)
Nr of volume calculations : ( 4)
Average volume (A3) : ( 2.779E+04)
Volume corresponds to a sphere of radius (A) : ( 1.879E+01)
Standard deviation (A3) : ( 2.263E+01)
...
O > As4> ... Surface map file is surface_surfmol.map

Now you should have an additional object called "SURF" (coloured blue). This represents the surface accessible to a water molecule (assuming it has a radius of 1.4 Å).

The macros run an external program called "VOIDOO". This program reads your structure and a library of van der Waals radii. It then creates a three-dimensional grid. Points on this grid are set to zero if they are outside the surface of interest, and to one if they are inside it. The macro then executes another program which converts this grid into a so-called "map" which can be read and displayed by O. The maps are stored in separate files:

* surface_vdwmol.map contains the van der Waals surface
* surface_surfmol.map contains the water-accessible surface

What do you notice when you compare the two volumes and when you compare the two objects ?

6.3 - More superficiality

It is handy to have macros to draw the surfaces. Type the following macro into a file:

! draw the van der Waals surface
! and the water-accessible surface
! of P2 myelin
!
del vdw surf ;
!
map_file surface_vdwmol.map
map_obj vdwp2
map_par 20 20 20 0.01 red ; 1
map_act
map_draw
on_off
!
map_file surface_surfmol.map
map_obj surfp2
map_par 20 20 20 0.01 blue ; 5
map_act
map_draw
on_off
!
bell message Done

Now centre on one of the atoms of the ligand and execute the macro. Which atoms are sticking out of the water-accessible surface ? Any suggestions as to what this implies ?

There are a number of O commands related to displaying maps:

* "Map_file" - defines the file which contains the map in a form that can be read by O
* "Map_object" - defines the name of the graphics object (containing -a part of- the map) that you want to create
* "Map_parameters" - sets some parameters used for drawing the map:
  - how far does the map have to be drawn in X, Y and Z (three numbers, in Å) ?
  - at what level should the map be drawn (for surfaces, a number between zero and one) ?
  - in what colour ?
  - what should the intensity at the front and at the back be (just use the default values) ?
  - how should the map be drawn ? Here you may enter a number from 1 to 5: 1 = with solid lines, 2 = dashed, 3 = dotted, 4 = dot-dashed, 5 = as a semi-transparent surface (for this type, there are only a few colours)
* "Map_active_centre" - this means, take the coordinates of the current centre of the screen as the centre of the map object
* "Map_draw" - actual drawing of the map

There are more Map commands, but they are not used in this tutorial, so there's no need to worry about them. Map objects can be deleted with the "Delete_object" command, just like any other type of object.
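
The zero/one grid behind these surface maps, and the "Volume corresponds to a sphere of radius" line in the macro output, can be illustrated with a small Python sketch. This is not how VOIDOO itself is implemented (its output shows several volume calculations being averaged, for instance); the function names, the grid spacing and the single test atom are assumptions made for the sake of the example.

```python
import math


def grid_volume(atoms, spacing=0.25):
    """Estimate the volume (Å^3) enclosed by a set of atomic spheres.

    atoms is a list of (x, y, z, radius) tuples in Å. A grid point counts as
    "one" (inside) if it lies within the radius of any atom, else as "zero".
    """
    lo = [min(a[k] - a[3] for a in atoms) for k in range(3)]
    hi = [max(a[k] + a[3] for a in atoms) for k in range(3)]
    n = [int(math.ceil((hi[k] - lo[k]) / spacing)) + 1 for k in range(3)]
    inside = 0
    for i in range(n[0]):
        x = lo[0] + i * spacing
        for j in range(n[1]):
            y = lo[1] + j * spacing
            for k in range(n[2]):
                z = lo[2] + k * spacing
                if any((x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2 <= r * r
                       for ax, ay, az, r in atoms):
                    inside += 1
    # Every "inside" grid point stands for one small cube of volume.
    return inside * spacing ** 3


def equivalent_sphere_radius(volume):
    """Radius (Å) of the sphere that has the given volume (Å^3)."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)


# One made-up atom with a radius of 2.0 Å; the exact volume is about 33.5 Å^3.
print(grid_volume([(0.0, 0.0, 0.0, 2.0)]))
# The average van der Waals volume reported for P2 in the macro output.
print(round(equivalent_sphere_radius(1.392e4), 2))
```

The second number reproduces the radius printed by the macro for the van der Waals volume of P2 (1.492E+01 Å). A finer grid gives a better volume estimate at the cost of more computing time.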

There are two more O commands which may be useful when you are looking at surfaces, cavities and maps:

* "Rot_tran_obj" - applies an operator to a non-molecular graphics object (e.g., a map)
* "CPK_object" - takes a molecular graphics object and produces a space-filling (CPK) model of it

If you have time to spare, you might like to try these commands as follows:

* calculate the solvent-accessible surface around your original lipocalin structure; apply the best operator from chapter 5 to display this surface on top of P2 myelin protein
* create a "CPK_object" of the fatty acid in P2, delete the water-accessible surface and re-draw it with line type 5; now re-evaluate the question about which atoms stick out of this surface.

6.4 - What now ?

(1) Explain what a van der Waals surface is. What is the difference with a solvent-accessible surface ?
(2) Calculate the van der Waals volume of your lipocalin and compare to P2. Pool your results with those of the other students and rank the various lipocalins according to volume.

Your notes:

7.0 - Get out the yard-stick !

In this chapter you will learn how to do simple quantitative (distances etc.) and qualitative (hydrogen bonds etc.) analyses of (protein) structures with O.

New O commands:

Angle_define Centre_next Centre_previ Db_statistic Dist_define Hbonds_all Hbonds_mc Neighbour_at Neighbour_re Pep_Flip PhiPsi RSC_fit Trig_reset

Your notes:

7.1 - What's your angle ?

In the previous chapter, while you were looking for protein residues interacting with a ligand, you probably would have liked to have had some O commands to measure distances, to find hydrogen bonds, to identify nearby residues, etc. Such commands are available in O; they are part of the "Trig" (for "trigonometry") major menu. Add this major menu to the menu on your graphics window:

O > menu trig on white
O > on_off
O > save

The two simplest commands are "Dist_define" and "Angle_define". When you activate "Dist_define", O expects you to pick two atoms.
It will then make a small graphics object called "DISTANCES" which contains a line between the two atoms you picked, plus their distance (in Å). Try this command, and measure a couple of distances involving the oxygen atoms of the fatty-acid ligand inside P2. If you want to hide the "DISTANCES" object temporarily, you can click it on and off. If you want to remove this object, use the command "Trig_reset".

The "Angle_define" command is similar to the "Dist_define" command, except, of course, that you have to pick three, rather than two, atoms. Measure some angles involving atoms which might be involved in hydrogen bonds between the protein and the ligand.

In P2 myelin protein, there are two cysteine residues which are fairly close together. What are their residue names and types ? (Remember that cysteine residues may have several different names in PDB files; which ones do you know ?) Investigate whether or not they form a disulfide bridge. If they do, the following criteria should be satisfied ("i" and "j" are the residue names of the two cysteines):

* both angles S-gamma(i)-S-gamma(j)-C-beta(j) should be ~103 - 108 °
* both angles S-gamma-C-beta-C-alpha should be ~110 - 116 °
* both angles C-beta-C-alpha-N should be ~110 °
* the distance C-alpha(i)-C-alpha(j) should be ~4.5 - 6.5 Å
* the distance C-beta(i)-C-beta(j) should be ~3.5 - 4.5 Å
* both distances C-beta-S-gamma should be ~1.8 - 1.9 Å
* both distances S-gamma(i)-C-beta(j) should be ~3 Å
* the distance S-gamma(i)-S-gamma(j) should be ~2.0 - 2.1 Å

Measure these angles and distances for the pair of cysteines in P2. What is your conclusion ? Which criterion should you check first in order to see immediately if two cysteines can be in a bridge at all ?

How many cysteines does your own lipocalin contain ? If there are two or more, investigate whether or not any pair of them forms a disulfide bridge.
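
The numbers reported by "Dist_define" and "Angle_define" are ordinary vector geometry. A minimal sketch in Python (outside O; the function names and the coordinates are made up for illustration and are not taken from P2):

```python
import math


def distance(a, b):
    """Distance between two points (same unit as the coordinates, here Å)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def angle(a, b, c):
    """Angle a-b-c in degrees, with atom b at the apex."""
    u = [p - q for p, q in zip(a, b)]
    v = [p - q for p, q in zip(c, b)]
    cosine = sum(p * q for p, q in zip(u, v)) / (distance(a, b) * distance(c, b))
    # Guard against rounding pushing the cosine just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


# Made-up coordinates (in Å).
atom_1 = (0.0, 0.0, 0.0)
atom_2 = (2.0, 0.0, 0.0)
atom_3 = (2.0, 1.5, 0.0)
print("distance 1-2: %.2f" % distance(atom_1, atom_2))
print("angle 1-2-3: %.1f" % angle(atom_1, atom_2, atom_3))
```

With coordinates copied from a PDB file, the same two functions can be used to check a list of distance and angle criteria such as the one above.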

7.2 - Neighbours

You should have two commands on your menu now which can be used to find out which residues/atoms are close to a particular atom or residue. First, centre on one of the oxygens of the fatty acid in P2. Then activate the "Neighbour_atom" command and click on both oxygens in turn. Note that the command stays active (how do you know this ?); you must switch it off explicitly with "Clear_flags" or "Trig_reset". Which atoms are neighbours of the two oxygens ?

Deactivate the command, and activate the "Neighbour_residue" command instead. There is a distance cut-off which determines whether or not an atom is considered to be a neighbour. Naturally, the value of this cut-off is stored in your database, namely in the ".TRIG_REAL" datablock. What is the default value ? Change the value to 3.5 Å while the "Neighbour_residue" command is active. Which additional residues are now considered to be neighbours of the oxygen atoms of the ligand ? Which atoms in the aliphatic part of the ligand have neighbours in the protein ? What type of residues are these neighbours in ?

7.3 - Interactive

Three types of non-covalent interaction are very common in protein structures:

* salt links (charged interactions)
* hydrophobic contacts
* hydrogen bonds (polar interactions)

A salt link can only occur between two charged groups (one must have a positive charge, the other a negative one). The distance between the atoms must be roughly 2.5 to 3.2 Å. Which amino-acid types can form salt links ?

Hydrophobic contacts are very common, especially in the core of proteins (why is this?), but also at the interface between proteins, or a protein and another compound which contains hydrophobic groups. A special type of hydrophobic contact is aromatic ring stacking.

Hydrogen bonding is an extremely important type of interaction; it is one of the major factors that stabilise protein (secondary) structure !
A hydrogen bond involves a hydrogen bond donor (a polar atom with at least one hydrogen atom attached to it, e.g. the amide N-H, water H-O-H, hydroxyl O-H) and an acceptor (a polar atom with at least one lone pair of electrons available for binding the hydrogen, e.g. carbonyl =O>, water <O). Hydrogen bonds are often classified as main chain - main chain (MC-MC), main chain - side chain (MC-SC) or SC-SC. Make a list of all amino-acid types that have side chain atoms which could form hydrogen bonds; classify them as donors or acceptors. Which of them are both ?

Why are these three types of interaction called "non-covalent" ? Name one example of a covalent interaction that may occur in proteins. Try to order the three interaction types by the strength of the interaction (stabilisation) and the specificity of the interaction.

Now analyse the interactions between the fatty-acid ligand and P2 in detail. Classify them as hydrophobic, charged or polar.

There are two O commands which can be of help in finding hydrogen bonds: "Hbonds_mc" and "Hbonds_all". The former draws the MC-MC hydrogen bonds for a zone of residues, the latter draws all hydrogen bonds for a zone of residues. Use the "Hbonds_mc" command to draw all MC-MC hydrogen bonds for the zone A15 to A35 (the two helices). Make a list of the bonds that are drawn. Can you find any patterns ? It probably helps if you make the list as follows:

Residue (I) I+1 I+2 I+3 I+4 Other
Asn A15 - - - - O-N -
Phe A16 - - - - O-N -
...

O knows which atom types there are in a protein main chain. If you want to look at hydrogen bonds involving side chain atoms (or non-protein atoms), you'll have to provide O with a dictionary file. There is a ready-made one for proteins, called "odat/residue_dict.o". Have a quick look at the contents of this file so you know how hydrogen bond donors and acceptors are defined. After that, read in the file with the good old "Read_formatted" command.
Now use the "Hbonds_all" command on the same zone of residues as before. Which MC-SC and SC-SC hydrogen bonds are drawn ? Add them to your list. Are there any intra-residue hydrogen bonds (i.e., involving two atoms in the same residue) ? And can you find hydrogen bonds between the two helices ?

O > read odat/residue_dict.o
O > hbo_all p2a a15 a35

Create a graphics object which contains the zone A48 to A65 in P2 (two sequential anti-parallel beta-strands). Use the "Hbonds_mc" command again, and make a list of interactions. Can you detect any patterns ? Use the "Hbonds_all" command again. Which new bonds show up ? Check the SC-SC interactions between residues A52, A54 and A61. Do you think that these are really hydrogen bonds, or something else ? Which interactions stabilise the turn between the two strands ?

7.4 - The main chain stays mainly in the plane

Since bond lengths and bond angles of the main chain of a protein vary extremely little between residues and between proteins, the conformation of the main chain can be described very well by considering only rotations around the three different types of bonds that it contains: around N-C-alpha, around C-alpha-C and around C-N. These rotations can be quantified by calculating so-called dihedral angles (also called twist angles or torsion angles). In the case of the rotation around C-alpha-C, for instance, you need to take into account their direct neighbours, i.e. N (next to C-alpha) and N(i+1) of the next residue (next to C). The dihedral angle is defined as the angle between the two planes through N/C-alpha/C and C-alpha/C/N(i+1), respectively ("dihedral" means "of two planes").
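
The definition above translates directly into a small calculation on four atomic positions. The following Python sketch is an illustration only (the function names and the coordinates are made up, and this is not O's own code). It assumes the commonly used sign convention in which the angle is positive when, looking along the central bond, the front atom has to be turned clockwise to eclipse the rear atom.

```python
import math


def sub(a, b):
    return [a[0] - b[0], a[1] - b[1], a[2] - b[2]]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle (degrees) for the atom sequence p1-p2-p3-p4."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1 = cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = cross(b2, b3)  # normal of the plane through p2, p3, p4
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up positions standing in for N, C-alpha, C and N(i+1).
n_i, ca_i, c_i, n_next = (0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)
print(dihedral(n_i, ca_i, c_i, n_next))
```

For these four points the two planes are perpendicular, so the function returns an angle of 90 degrees.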
The three types of dihedral angle have special names:

N-C-alpha dihedral   (C(i-1),N,C-alpha,C)              PHI (Φ, φ) of residue i
C-alpha-C dihedral   (N,C-alpha,C,N(i+1))              PSI (Ψ, ψ) of residue i
C-N dihedral         (C-alpha,C,N(i+1),C-alpha(i+1))   OMEGA (Ω, ω) of residue i

The Omega angle describes the rotation around the C-N(i+1) bond, but since this is the peptide bond, there is no free rotation around this bond (explain this; remember tautomerism). Therefore this dihedral has a fixed value. It can be either zero or 180 °; in one case you have a cis-peptide bond, in the other a trans-peptide bond. Which one is which ?

It turns out that in proteins almost all peptide bonds are trans. Only prolines sometimes occur (as residue "i+1") in a cis-peptide. Can you explain why cis-peptides are not favourable for most residues ? And why it doesn't make such a big difference in the case of prolines ? In case of a cis-peptide X-Pro (where "X" may be any type of residue), which of the two residues has an Omega angle of 0° and which has an Omega angle of 180° ?

The Phi and Psi dihedral angles can be measured with O if you activate the "PhiPsi" command. Whenever you click on an atom, the dihedral angles of the corresponding residue will be displayed at the second line from the top of the graphics window. When you have had enough, you must deactivate the command with "Clear_flags".

Make a C-alpha trace of P2 and measure the Phi and Psi dihedral angles for all residues in the zones A15 to A35 and A48 to A65. Draw them in the figure on the next page, using the following legend:

* a small square to mark glycine residues
* a small circle to mark proline residues
* a small cross for all other residue types

What is the trend that you observe ? Can you explain this ? The drawing you have made is called a Ramachandran plot.
It turns out that most protein residues have combinations of Phi and Psi which fall into distinct regions, depending on the type of secondary structure element that the residue is in.

There is a quicker way to get a plot and a listing of the Phi/Psi angles: go out of O (or use another terminal window) and use a precooked script (a "Unix macro", if you like) called "omac/make_rama.csh":

unix > omac/make_rama.csh
Name of your PDB file ? p2a.pdb
Making Ramachandran plot for p2a.pdb ...
The listing is in file p2a_rama.list
The plot file is p2a_rama.ps

There are two new files: a text file with the Phi and Psi values ("p2a_rama.list") and a plot file in so-called PostScript format ("p2a_rama.ps"). Look at the plot by typing the following:

unix > gs p2a_rama.ps
Initializing... done.
Loading Times-Roman font from /public/gnu/gs241/fonts/ptmr.gsf ... 626416 622527 0 done.
>>showpage, press <return> to continue<<
quit
Ghostscript 2.4.1 (4/21/92) Copyright (C) 1990, 1992 Aladdin Enterprises, Menlo Park, CA. All rights reserved. Distributed by Free Software Foundation, Inc.
Ghostscript comes with NO WARRANTY: see the file LICENSE for details.
GS>
unix >

The Ramachandran plot is also one of the quickest methods of getting an impression of how good a protein structure is. In good structures, more than ~90 % of the non-glycine residues should be in one of the regions outlined in the PostScript plot. Very poor structures have their residues scattered all over the plot. One may have a couple of non-glycine residues outside these regions, but there has to be a sound structural reason for their being there (e.g., the residue is in a special type of turn, or it makes an important hydrogen bond, or it has to point away from another residue or a ligand). Make Ramachandran plots for P2 and for your own lipocalin and compare them. Which structure looks better ?

7.5 - Making flippy floppy

In this section and the next, you will learn about two more indicators of protein quality.
Apart from this, they may also help you locate errors (OR very interesting residues !) in the structure. The first is another main-chain-based criterion, calculated on a per-residue basis. It has to do with the orientation of the peptide. If you look at a trans-peptide bond, you may notice that it can assume two different conformations.

Inspect the peptide bonds between residues A51 and A52 and between A52 and A53 in P2. Verify that they have different peptide orientations and that in both cases the Omega dihedral is ~180°. Hint: first centre on the amide nitrogen, then orient the molecule such that the C-N(i+1) bond is perpendicular to the graphics window. The angle between the C-alpha-C and the N(i+1)-C-alpha(i+1) bonds is then equal to the Omega dihedral angle. (If the dials move too fast, you may slow them down by typing: "db_set_data .dial_real 1 1 0.15"; experiment with the latter value until you are comfortable.)

When a protein structure is built using crystallographic data, it is often difficult to decide on the proper orientation of the peptide. However, O contains a tool which compares the orientation of the peptide of any residue which has at least two C- and two N-terminal neighbour residues with a database of high-quality protein structures. This comparison results in a number which is called the (RMS) "pep-flip" value (expressed in Å). The higher this number is, the more unusual the orientation of the peptide is. As is the case with outliers in the Ramachandran plot, this often means that the orientation is wrong (i.e., the peptide should be "flipped"), but occasionally it merely indicates that a residue has an unusual conformation, usually for a reason. A typical pep-flip cut-off value is 2.5 Å; if a residue has a value higher than that, it should be inspected more closely.
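Once you have a saved log of pep-flip values, the cut-off rule is easy to apply automatically. The Python sketch below is not part of O; it assumes that each log line looks like "Util> Residue A5 has a pep_flip r.m.s. value of 2.74" (the format printed by the "pep_flip" command) and it uses the 2.5 Å cut-off mentioned above as its default. The function name "suspect_residues" is made up for illustration.

```python
import re

# Assumed log format: "Util> Residue A5 has a pep_flip r.m.s. value of 2.74"
PATTERN = re.compile(
    r"Residue\s+(\S+)\s+has a pep_flip r\.m\.s\. value of\s+([0-9.]+)"
)


def suspect_residues(log_text, cutoff=2.5):
    """Return (residue, value) pairs whose pep-flip value exceeds the cut-off."""
    hits = []
    for name, value in PATTERN.findall(log_text):
        if float(value) > cutoff:
            hits.append((name, float(value)))
    return hits


if __name__ == "__main__":
    # Sample lines in the assumed format.
    log = (
        "Util> Residue A3 has a pep_flip r.m.s. value of 0.66\n"
        "Util> Residue A4 has a pep_flip r.m.s. value of 0.73\n"
        "Util> Residue A5 has a pep_flip r.m.s. value of 2.74\n"
    )
    print(suspect_residues(log))  # [('A5', 2.74)]
```

A residue that is flagged this way is only a candidate for closer inspection; whether its peptide is really wrong still has to be judged on the graphics screen.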
Calculate the pep-flip values for (almost) all residues in P2 myelin protein by typing: O > pep_flip p2a a1 a131 Util> P2A A1 A131 STRAND Util> Calculating zone A1 to A131 in molecule P2A , object STRAND Util> The DB is now being loaded. Util> Loading data for protein:HCAC ... Util> Loading data for protein:TLN_3 Util> Residue A3 has a pep_flip r.m.s. value of 0.66 Util> Residue A4 has a pep_flip r.m.s. value of 0.73 Util> Residue A5 has a pep_flip r.m.s. value of 2.74 ... Util> Residue A129 has a pep_flip r.m.s. value of 0.52 The first time you use this command, O loads the database of high-quality protein structures; subsequently, it starts calculating pep-flip values for all selected residues. Why don't the first and the last two residues get a pep-flip value ? The results are stored in a new datablock: O > dir *flip* Heap> P2A_RESIDUE_PEPFLIP R W 131 You can get some more information about the values in this datablock with the "DB_statistics" command: O > db_stat P2A_RESIDUE_PEPFLIP Heap> Minimum and maximum values: 0.0000 2.9999 Heap> Mean and standard deviation: 0.7465 0.5069 Now create a C-alpha trace of P2 called "FLIP" such that all residues with a pep-flip value < 2.5 Å are coloured blue and all residues with a pep-flip value >= 2.5 Å are colour ramped. Now modify the message template such that the pep-flip value is displayed when you click on an atom. Which residues have suspect pep-flip values ? Look at each of them in "close-up" and draw hydrogen bonds. Can you think of a reason for the unusual peptide orientations (consider what would happen if the peptide were flipped) ? Where are these residues in the Ramachandran plot ? In order to make it easier to go from one suspect residue to the next, you may use the "Centre_next" and "Centre_previous" commands. They allow you to go to the next atom or residue which satisfies a certain criterion. 
Try to find out yourself how these commands work, by just activating them and answering the questions that O asks you. Don't forget to centre explicitly on the first residue before you use "Centre_next" !

O > pai_ram
Paint> Colour-ramp a property in molecule P2A
Paint> Property [residue_irc] : res_pepflip
Paint> Minimum and maximum value of property [0.0000 2.9999] :
Paint> First colour [green] : blue
Paint> Second colour [red] : red
O > pai_prop
Paint> Property? [atom_name]: res_pepflip
Paint> Operator (< > <= >= ^= [=]): <
Paint> Value? []: 2.5
Paint> Colour? [blue]:
O > obj flip ca ; end
O > ce_at p2a a1 ca
O > ce_ne
As4> Property? [atom_name]: res_pep
As4> Operator (< > <= >= ^= [=]): >
As4> Value? []: 2.5
As4> Centering on P2A A5 CA
O > obj sph sphere ; end

Now calculate the pep-flip values for your own lipocalin and compare them to those of P2. Are there reasons for the high pep-flips ? How many residues are suspect ? Are they the same as in P2 ? What are the average, minimum and maximum pep-flip values ?

7.6 - Walk on the wild side chain

Hitherto, you have learned about two quality indicators / trouble spotters that are based on the conformation of the main-chain. However, it turns out that most side chains also occur predominantly in only a few preferred conformations (called rotamers). Again, there is a tool in O to compare the conformation of the side chains in your protein to those observed most commonly. The command is called "RSC_fit" (RSC stands for "Rotamer Side Chain"). The result is a number for each residue (except glycines and alanines; why is that, do you think ?) which indicates how closely its side chain conformation resembles the most similar rotamer found in the database. Again, the higher this number, the more suspect the residue is (either an error or an unusual conformation for a reason). A typical cut-off value is 1.5 Å.
Some residues with long, floppy side chains do not have distinct rotamers; in particular the ends of their side chains may be in almost any conformation. For such residues only the first few side chain atoms are used in the RSC calculation.

Calculate the RSC values for P2 myelin protein as follows:

O > rsc p2a a1 a131
Util> P2A A1 A131 SPH
Util> The Rotamer_DB is now being loaded.
Util> Calculating zone A1 to A131 in molecule P2A , object SPH
Util> Best rotamer for A1 is No. 2 with rms 1.233
...
Util> Best rotamer for A5 is No. 1 with rms 0.444
Util> SCGLY is missing.
Util> Best rotamer for A7 is No. 2 with rms 0.111
...
Util> Best rotamer for A131 is No. 1 with rms 0.673
O > db_sta p2a_residue_rsc
Heap> Minimum and maximum values: 0.0000 3.4147
Heap> Mean and standard deviation: 0.6826 0.5759

Make an object called "RSC" which is coloured such that you can see immediately which residues have suspect RSC values. (Hint: use similar commands as for the "FLIP" object.) Look at each of them more closely and see if you can find a reason for them having an unusual conformation. Modify the message template again such that the RSC values are also displayed. Calculate RSC values for your own lipocalin and compare with P2.

If you want to plot the two datablocks, first write each of them to a file and then use the "Unix macro" "omac/plot_flip_rsc.csh" (outside O) as follows:

O > wr p2a_residue_pepflip flip.odb ;
O > wr p2a_residue_rsc rsc.odb ;

unix > omac/plot_flip_rsc.csh
Name of your PEP_FLIP datablock file ? flip.odb
Name of your RSC datablock file ? rsc.odb
Making plot files ...
Almost done ...
Pep_flip plot in file flip.ps
RSC plot in file rsc.ps
Scatter plot in file flip_rsc.ps
unix > gs flip.ps rsc.ps flip_rsc.ps
Initializing... done.
Loading Times-Roman font from /public/gnu/gs241/fonts/ptmr.gsf ... 866416 846118 0 done.
>>showpage, press <return> to continue<<
>>showpage, press <return> to continue<<
>>showpage, press <return> to continue<<
quit
Ghostscript 2.4.1 (4/21/92)
Copyright (C) 1990, 1992 Aladdin Enterprises, Menlo Park, CA. All rights reserved.
Distributed by Free Software Foundation, Inc.
Ghostscript comes with NO WARRANTY: see the file LICENSE for details.
GS>
unix >

You will have three plot files:

* flip.ps = pep-flip values as a function of residue number
* rsc.ps = RSC values as a function of residue number
* flip_rsc.ps = a scatter plot of pep-flip and RSC values. In which area of this plot do you expect to find the "good" residues ?

7.7 - Answer me, please !

(1) What are the parameters for the following commands: hbonds_mc, pep_flip, angle_define, centre_next, db_statistics, phipsi ?

(2) What is the difference between "neighbour_atom" and "neighbour_residue" ?

(3) Which types of salt link may occur in protein structures ?

(4) Which amino acids can, in theory at least, form intra-residue hydrogen bonds ?

(5) What types of hydrogen bonds are typical of helices ? And of anti-parallel strands ?

(6) Compare and contrast cis- and trans-peptide bonds.

(7) Where do you expect to find helical residues in a Ramachandran plot ? And residues in strands ? Does it matter whether the strands are parallel or anti-parallel ? Why (not) ?

(8) Based on the Ramachandran plots, the pep-flip and the RSC values, which structure is "better", P2 or your own lipocalin ?

(9) Pool your "quality control" results with those obtained by the other students. Try to rank the lipocalins according to the quality of their structures.

Your notes:

8.0 - Teenage, mutant, Ninja proteins

In this chapter you will learn how to mutate a protein without getting your hands dirty and how to make changes to a structure.
New O commands: Flip_peptide Lego_Auto_MC Lego_Auto_SC Lego_Side_Ch Lego_loop Lego_setup Merge_atoms Move_atom Move_fragmen Move_zone Mutate_delet Mutate_inser Mutate_repla Mutate_setup Refi_setup Refi_zone Sam_init_db Sam_rename Select_off Select_on Tor_residue Trig_refresh Your notes: 8.1 - I think I'm having a fit ! In chapter 5 you superimposed your lipocalin structure on the structure of P2 myelin protein. You also saved the coordinates with the "Sam_atom_out" command. Read these coordinates in again, but call the molecule "MUTA". Draw a zone object which only contains the amino-acid residues in MUTA (no solvent molecules or ligands). O > s_a_i crp2.pdb muta ... Sam> Molecule MUTA contained 248 residues and 1236 atoms O > mol muta obj muta O > zo 1 134 end O > ce_zo muta 1 134 As4> MUTA 1 134 MUTA As4> Centering on zone from 1 to 134 Also read in a copy of the fatty-acid ligand of P2, but call the molecule "LIGA". Draw this ligand. O > s_a_i faa.pdb liga Sam> File type is PDB Sam> Database compressed. Sam> Space for 132619 atoms Sam> Space for 10000 residues Sam> Molecule LIGA contained 1 residues and 20 atoms O > mol liga zo ; end O > ce_zo liga a200 ; As4> No object defined. As4> LIGA A200 A200 LIGA As4> Centering on zone from A200 to A200 You are going to try and "re-design" your lipocalin such that it might bind the fatty acid in its present conformation. First, you have to get rid of any bad contacts that may exist. Use the command "Move_zone" to do this (what, do you think, is the difference between "Move_object" and "Move_zone" ?). Hit "Yes" if you are satisfied, "No" if you screw up completely. Use all the commands that you need ("Neighbour_residue", "Dial_next", etc. etc.). You will need to activate "Trig_refresh" as well. This command, while active, will update the distances (e.g., those shown through the "Neighbour_residue" command) as you move the ligand inside the cavity. O > mo_zo liga a200 ; Mnp> No object defined. 
Mnp> LIGA A200 A200 LIGA Mnp> Fragment pivot point: 46.696 68.219 37.927 O > O > O > O > O > O > Trig> Neighbour list truncated. Trig> Neighbour list truncated. Mnp> Coordinates updated Hopefully, you have succeeded in removing (most of) the bad contacts. But there is more you can do ! You can introduce mutations in your lipocalin which remove remaining bad contacts and/or introduce new favourable contacts. In the latter category, the salt link and hydrogen bonding partners of the two oxygens are of particular interest. Check which residues are close to these atoms and consider if you could introduce better contacts by changing one or more of these. Remember that residues with long, floppy side chains can more easily assume particularly favourable conformations than more rigid residues. The following example demonstrates how to mutate Gln 128 in CRBP to an arginine. The relevant O command is "Mutate_replace", but first use "Mutate_setup". Re-draw the MUTA object after the mutation has taken place. O > mut_setup Mut> Auto-build side chains ([Y]/N)? y O > mut_repl Mut> Mutate a molecule by replacing one residue type Mut> by another. Mut> Molecule ([LIGA ]) : muta Mut> Residue name and new type (<cr> to end) : 128 arg Mut> Residue name and new type (<cr> to end) : Mut> There are 1 mutations Mut> The Rotamer_DB is now being loaded. O > mol muta zo 1 134 end If you do this, you should get a new arginine which is coloured magenta. The new residue has been given the side chain conformation of the most frequent rotamer observed in the database. This is not necessarily the best-fitting one. You may use the "Lego" commands to select another rotamer. First, use "Lego_setup", then activate "Lego_side_chain": O > lego_setup Lego> Define Proleg Paramaters. 
Lego> Drawing of Ca traces ([on]/off): Lego> Good-bad fit colour ramping [ 120 40]: Lego> File of Diagonal Distances [/nfs/taj/alwyn/o/data/dgnl.o]: Lego> Directory containing Protein Database [/nfs/taj/alwyn/o/data/]: Lego> File of side chain rotamers[/nfs/taj/alwyn/o/data/rsc.o]: O > lego_si_ch muta 128 Lego> MUTA 128 CA MUTA The bottom-right dial can now be used to browse through the "catalogue" of arginine side chains ! When you find the best-fitting one, hit "Yes"; if you want to keep the original one, hit "No". Check if there are more residues that need to be replaced (e.g., charged residues pointing toward the aliphatic tail of the ligand) and mutate them. Save your database regularly ! If you want to apply a lot of mutations, check out the file "omac/mutator.odb" ! 8.2 - More mutations There are two more mutation-related O commands that you may need: "Mutate_delete" and "Mutate_insert". To experiment with deleting residues, find a residue in your MUTA molecule which is in a loop or turn. Delete this residue as follows: O > mu_del Mut> Mutate a molecule by deleting residues Mut> Molecule ([MUTA ]) : Mut> Residue name (<cr> to end) : 112 Mut> Residue name (<cr> to end) : Mut> There are 1 mutations O > zo 1 134 end Oops - a break in the chain ! You can restore reasonable geometry of the protein with another Lego command: "Lego_loop". However, this command requires you to select a zone of residues before you activate the command. This is done with the "Select" commands in O, in this case "Select_on" and "Select_off". 
Just apply the following "recipe" to your case: * first use "Select_on" to select the entire MUTA molecule * then use "Select_off" to deselect the two (or four) neighbours of the deleted residue * centre on one of the neighbours of the deleted residue and zoom in * use "Lego_loop" and supply a zone which extends two residues at each side of the zone that you just deselected * all the "hits" from the database are displayed; the one with the best fit is drawn with side chains, the others are shown as C-alpha traces. Use the bottom-right dial to flip through the hits; select the one you like best and hit "Yes" to accept it (or hit "No" if you don't like any of them) * use "Select_on" to select the entire MUTA molecule again * redraw the MUTA object O > select_on Sel> What molecule [MUTA ]: Sel> Residue range [all molecule]: O > sel_off muta 110 114 O > lego_loop muta 108 116 Lego> MUTA 108 116 MUTA Lego> Used. Lego> Used. Lego> Not used. Lego> Not used. Lego> Not used. Lego> Not used. Lego> Used. Lego> Used. Lego> Number of selected atoms in zone is 4 Lego> The DB is now being loaded. Lego> Loading data for protein:HCAC Lego> Loading data for protein:PA ... Lego> Loading data for protein:TLN_3 Lego> DGNL> Top matches Lego> Protein Start Res. Score Sequence Lego> TLN_3 22 0.322 TTYSTYYY Lego> SGA_2 141 0.543 LFAGSTAL Lego> APP_2 276 0.555 PSGDGSTC Lego> RHD_1 98 0.565 YNGDDLGS ... Lego> SN3_1 6 0.748 VKKSDGCK Lego> APP_2 10 0.777 PTANDEEY O > yes O > zo 1 134 end 8.3 - Cleaning up Compare the loop with that in the old, non-mutated structure (using the graphics and the Lsq commands). Calculate pep-flip and RSC values for both: O > pep_flip muta 108 116 Util> MUTA 108 116 CRP2 Util> Calculating zone 108 to 116 in molecule MUTA , object CRP2 Util> Residue 108 has a pep_flip r.m.s. value of 0.55 Util> Residue 109 has a pep_flip r.m.s. value of 0.75 Util> Residue 110 has a pep_flip r.m.s. value of 1.27 Util> Residue 111 has a pep_flip r.m.s. 
value of 2.31 Util> Residue 113 has a pep_flip r.m.s. value of 1.33 Util> Residue 114 has a pep_flip r.m.s. value of 1.24 Util> Residue 115 has a pep_flip r.m.s. value of 0.54 Util> Residue 116 has a pep_flip r.m.s. value of 0.45 O > pep_flip crp2 108 116 Util> CRP2 108 116 CRP2 Util> Calculating zone 108 to 116 in molecule CRP2 , object CRP2 Util> Residue 108 has a pep_flip r.m.s. value of 0.64 Util> Residue 109 has a pep_flip r.m.s. value of 0.74 Util> Residue 110 has a pep_flip r.m.s. value of 0.62 Util> Residue 111 has a pep_flip r.m.s. value of 1.40 Util> Residue 112 has a pep_flip r.m.s. value of 2.76 Util> Residue 113 has a pep_flip r.m.s. value of 0.90 Util> Residue 114 has a pep_flip r.m.s. value of 0.65 Util> Residue 115 has a pep_flip r.m.s. value of 0.37 Util> Residue 116 has a pep_flip r.m.s. value of 0.41 O > rsc muta 108 116 Util> MUTA 108 116 CRP2 Util> Calculating zone 108 to 116 in molecule MUTA , object CRP2 Util> Best rotamer for 108 is No. 2 with rms 1.326 Util> Best rotamer for 109 is No. 2 with rms 4.112 Util> Best rotamer for 110 is No. 1 with rms 2.045 Util> Best rotamer for 111 is No. 1 with rms 4.165 Util> Best rotamer for 113 is No. 2 with rms 1.700 Util> Best rotamer for 114 is No. 1 with rms 1.811 Util> Best rotamer for 115 is No. 2 with rms 1.164 Util> Best rotamer for 116 is No. 1 with rms 1.153 O > rsc crp2 108 116 Util> CRP2 108 116 CRP2 Util> Calculating zone 108 to 116 in molecule CRP2 , object CRP2 Util> Best rotamer for 108 is No. 2 with rms 1.325 Util> Best rotamer for 109 is No. 2 with rms 3.466 Util> Best rotamer for 110 is No. 1 with rms 0.465 Util> Best rotamer for 111 is No. 1 with rms 0.493 Util> SCGLY is missing. Util> Best rotamer for 113 is No. 3 with rms 0.242 Util> Best rotamer for 114 is No. 1 with rms 0.538 Util> Best rotamer for 115 is No. 2 with rms 0.512 Util> Best rotamer for 116 is No. 1 with rms 1.153 Clearly, the mutated loop needs some "polishing". The first thing to do is some "regularisation". 
This is a method to restore some proper bond lengths, bond angles and dihedrals. It can be used in O through the "Refi_zone" command (first use the "Refi_setup" command): O > refi_setup Refi > Name of dictionary file [/nfs/taj/alwyn/o/data/dict_pdb.dat]: O > refi_zone muta 108 116 Refi > MUTA 108 116 CRP2 Refi > Refining zone 108 to 116 in molecule MUTA , object CRP2 Refi > 561 lines read from dictionary Refi > R.m.s.d. in bond lengths, angles, fixed diherals Refi > 0.05 3.58 3.47 Refi > Accept new coordinates? Hit *Yes/*No O > yes You will usually need to repeat this a few times until you get no more improvement: O > refi_zone muta 108 116 Refi > MUTA 108 116 CRP2 Refi > Refining zone 108 to 116 in molecule MUTA , object CRP2 Refi > R.m.s.d. in bond lengths, angles, fixed diherals Refi > 0.03 2.18 2.42 ... Refi > 0.01 1.16 1.72 Refi > Accept new coordinates? Hit *Yes/*No O > yes Now, run pep-flip again for the affected zone and look for suspect residues: O > pep_flip muta 108 116 Util> MUTA 108 116 CRP2 Util> Calculating zone 108 to 116 in molecule MUTA , object CRP2 Util> Residue 108 has a pep_flip r.m.s. value of 0.56 Util> Residue 109 has a pep_flip r.m.s. value of 0.93 Util> Residue 110 has a pep_flip r.m.s. value of 1.04 Util> Residue 111 has a pep_flip r.m.s. value of 2.09 Util> Residue 113 has a pep_flip r.m.s. value of 1.14 Util> Residue 114 has a pep_flip r.m.s. value of 1.16 Util> Residue 115 has a pep_flip r.m.s. value of 0.82 Util> Residue 116 has a pep_flip r.m.s. value of 0.41 In this case, the results look okay. But if you have a residue with a bad pep-flip, you can experiment with the "Flip_peptide" command in O: it allows you to flip the peptide back and forth (almost reversibly): O > fli_pep muta 111 Mnp> MUTA 111 CA CRP2 Mnp> Flipping peptide of residue 111 Did you flip the right peptide bond ? How are the Phi, Psi and Omega values of the two residues affected ? 
Check that you actually made things better: O > pep_flip muta 111 111 Util> MUTA 111 111 CRP2 Util> Calculating zone 111 to 111 in molecule MUTA , object CRP2 Util> Residue 111 has a pep_flip r.m.s. value of 2.92 In this case, the pep-flip value got worse ! Hence, flip the peptide back again: O > fli_pep muta 111 Mnp> MUTA 111 CA CRP2 Mnp> Flipping peptide of residue 111 Now re-calculate the RSC values for the zone and look for suspect residues again: O > rsc muta 108 116 Util> MUTA 108 116 CRP2 Util> Calculating zone 108 to 116 in molecule MUTA , object CRP2 Util> Best rotamer for 108 is No. 2 with rms 1.278 Util> Best rotamer for 109 is No. 2 with rms 3.657 Util> Best rotamer for 110 is No. 1 with rms 0.569 Util> Best rotamer for 111 is No. 1 with rms 1.878 Util> Best rotamer for 113 is No. 2 with rms 1.206 Util> Best rotamer for 114 is No. 1 with rms 1.189 Util> Best rotamer for 115 is No. 2 with rms 0.502 Util> Best rotamer for 116 is No. 1 with rms 1.264 In this example, residue 109 (a tryptophan) has a very bad RSC value. Use the "Hbonds_all" command on the ORIGINAL structure to see if there was a special structural reason for this: O > hbo_all crp2 100 120 Trig> CRP2 100 120 CRP2 Trig> H-bonds from 100 to 120 Trig> No H-bond information for residue RTL Trig> No H-bond information for residue CD2 The N-epsilon1 atom in the ring interacts with the O-epsilon2 atom of Glu 111. Check if the same interaction is possible in the mutated structure. If so, leave the side chain alone, if not use "Lego_side_chain" to find a better one. 8.4 - Insertions O can also be used to introduce insertions. Use the "Mutate_insert" command to re-insert the residue that you deleted in the previous section: O > mut_ins Mut> Mutate a molecule by inserting residues. 
Mut> Molecule ([CRP2 ]) : muta Mut> After which residue: 111 Mut> New residue name and type (<cr> to end) : 111a gly Mut> New residue name and type (<cr> to end) : Mut> There are 1 mutations O > mol muta zo 1 134 end The inserted residue is not shown, since you don't have any coordinates for it yet. Use "Lego_loop" again to get coordinates (and "Lego_side_chain" to add a side chain in case your residue is not a glycine): O > sel_on muta ; O > sel_off muta 111 113 O > lego_loop muta 109 115 Lego> MUTA 109 115 MUTA Lego> Used. Lego> Used. Lego> Not used. Lego> Not used. Lego> Not used. Lego> Used. Lego> Used. Lego> Number of selected atoms in zone is 4 Lego> DGNL> Top matches Lego> Protein Start Res. Score Sequence Lego> BP2_1 76 0.111 SCSNNEI Lego> APP_2 258 0.137 VSISGYT Lego> TLN_3 114 0.142 FWNGSEM ... Lego> PA 101 0.361 LLSPYSY O > yes O > zo 1 134 end O > refi_zo muta 109 115 Refi > MUTA 109 115 MUTA Refi > Refining zone 109 to 115 in molecule MUTA , object MUTA Refi > R.m.s.d. in bond lengths, angles, fixed diherals Refi > 0.02 2.34 4.37 ... Refi > 0.00 0.96 1.60 Refi > Accept new coordinates? Hit *Yes/*No O > yes O > zo 1 134 end O > pep_flip muta 110 114 Util> MUTA 110 114 MUTA Util> Calculating zone 110 to 114 in molecule MUTA , object MUTA Util> Residue 110 has a pep_flip r.m.s. value of 0.72 Util> Residue 111 has a pep_flip r.m.s. value of 2.12 Util> Residue 111A has a pep_flip r.m.s. value of 0.54 Util> Residue 113 has a pep_flip r.m.s. value of 0.83 Util> Residue 114 has a pep_flip r.m.s. value of 0.34 O > rsc muta 110 114 Util> MUTA 110 114 MUTA Util> Calculating zone 110 to 114 in molecule MUTA , object MUTA Util> Best rotamer for 110 is No. 1 with rms 0.528 Util> Best rotamer for 111 is No. 1 with rms 2.719 Util> SCGLY is missing. Util> Best rotamer for 113 is No. 3 with rms 0.526 Util> Best rotamer for 114 is No. 1 with rms 0.853 After inserting, you usually have a strange numbering of the residues. 
You can rename them with the "Sam_rename" command:

O > sam_rename
Sam> What molecule [MUTA ]:
Sam> Residue range [all molecule]:
Sam> NEW name of FIRST residue [ ]: 1

Compare the structure after deleting and re-inserting with the starting structure. What is the RMSD of the C-alpha atoms of the residues near the mutation site ?

O > lsq_expl
Lsq > Least squares match by explicit definition of atoms.
Lsq > Given 2 molecules A, B the transformation rotates B onto A
Lsq > What is the name of A (the not rotated molecule)? crp2
Lsq > What is the name of B (the rotated molecule)? muta
...
Lsq > Define atoms from CRP2 (the not rotated molecule): 108 116
Lsq > Define atoms from MUTA (the rotated molecule): 108
Lsq > Define atoms from CRP2 (the not rotated molecule):
Lsq > The 9 atoms have an r.m.s. fit of 0.514
Lsq > xyz(1) = 0.9992*x+ -0.0394*y+ -0.0052*z+ 2.2780
Lsq > xyz(2) = 0.0396*x+ 0.9975*y+ 0.0579*z+ -3.6495
Lsq > xyz(3) = 0.0029*x+ -0.0581*y+ 0.9983*z+ 3.2274
Lsq > The transformation can be stored in O.
Lsq > A blank is taken to mean do not store anything
Lsq > The transformation will be stored in .LSQ_RT_junk

8.5 - Fine-tuning

Now go back to the ligand which you were trying to fit into the cavity of your lipocalin. Put the Manip major menu on the screen (remove some other menu items that you don't use often if the menu gets too long):

O > ce_zo liga a200 ;
As4> No object defined.
As4> LIGA A200 A200 MUTA
As4> Centering on zone from A200 to A200
O > menu manip
As2> [On]/off:
As2> Colour? (<cr> = no change): cyan
O > on

Some of these commands you have already used (which ?), others are new:

* "Move_atom" - move one atom at a time; this command should only be used to move water molecules and metal ions around since it is bound to seriously ruin your stereo-chemistry
* "Move_fragment" - most residues are considered to consist of one or more rather rigid fragments which can be moved with respect to one another
* "Tor_residue" - with this command you can change the torsion angles of a residue (including Phi and Psi) with the dials

The best way to optimise the positioning of a side chain is:

* use "Lego_side_chain" to give the residue a rotamer as a side chain
* use "Move_zone" on only this residue; if you click twice on its C-alpha atom, this will become the "pivot point" of the rotations that you carry out
* use "Move_fragment" for example to rotate rings
* use "Tor_residue" to adjust long, floppy side chains
* do not use "Move_atom" if you can avoid it
* use "Refi_zone" to repair accidental screw-ups in your stereo-chemistry

Experiment with some of these commands to further optimise the contacts between the ligand and your lipocalin. Don't forget to regularise. There is no guarantee that your mutant will actually bind the fatty-acid ligand (or that it will even fold !). To find this out you WILL have to get your hands dirty ...

Making insertions before the first and after the last residue is a bit more complicated. These problems are discussed in appendix 10.3.

If you have time left, check the fit of the ligand in the following way:

* generate a van der Waals surface around your mutated protein, MUTA
* generate a CPK object around the ligand, LIGA
* check that the carbon atoms of the ligand do not penetrate the van der Waals surface

8.6 - Reconstructing a structure

You should do the exercise in this section only if you have sufficient time left !
It will show you how to make a model of a complete protein structure starting from merely the coordinates of a subset of the C-alpha atoms. This situation does sometimes occur in practice ! First you'll need to generate a PDB file containing only the coordinates of a subset of the C-alpha atoms of your lipocalin. Do this outside O as follows: unix > grep ATOM crbp.pdb | grep CA | grep -v TYR > ca.pdb unix > head -10 ca.pdb ATOM 9 CA PRO 1 6.008 -12.451 -13.561 1.00 19.84 ATOM 12 CA VAL 2 3.653 -9.873 -12.043 1.00 12.58 ATOM 20 CA ASP 3 2.602 -6.267 -12.523 1.00 10.97 ATOM 29 CA PHE 4 4.162 -3.536 -10.408 1.00 6.50 ATOM 41 CA ASN 5 2.594 -0.497 -12.126 1.00 7.18 ATOM 52 CA GLY 6 1.235 2.109 -9.664 1.00 4.35 ATOM 71 CA TRP 8 3.014 3.006 -3.584 1.00 3.27 ATOM 87 CA LYS 9 3.422 4.578 -0.194 1.00 6.07 ATOM 100 CA MET 10 5.604 3.204 2.570 1.00 5.70 ATOM 109 CA LEU 11 3.946 1.174 5.316 1.00 8.53 The first command takes all lines from your PDB file which contain the string "ATOM", then only the ones which contain "CA" are kept and, finally, all lines which contain "TYR" are thrown away. The remaining lines are written to a file called "ca.pdb". This leaves you with a PDB file which contains the C-alpha coordinates of all residues, except the tyrosines (you may select another residue type if you like). As you can see in the example, the C-alpha of residue 7 is missing; this is, indeed, a tyrosine. Read this new PDB file into O and call the "molecule" "MINI": O > s_a_i ca.pdb mini Sam> File type is PDB Sam> Database compressed. Sam> Space for 87687 atoms Sam> Space for 10000 residues Sam> Molecule MINI contained 131 residues and 131 atoms O > mol mini obj camini ca ; end O > ce_zo mini 1 134 As4> MINI 1 134 CAMINI As4> Centering on zone from 1 to 134 The first step is to make an empty molecule in O which has "space" (in the database) for all atoms of the residues. 
This is done as follows (if you are clever, you'll replace the first three steps by a single O command ...): * write out the datablock with the residue types of molecule MINI * edit this file and change the name of the datablock to "MAXI_RESIDUE_TYPE" * read the datablock in again * use the "Sam_init_db" command to "make room" for a new molecule called MAXI, using the sequence you just read in * use the "Merge_atoms" command to copy the C-alpha coordinates from molecule MINI to MAXI * delete molecule MINI (optional) O > sam_init Sam> This WILL initialise certain datablocks. Sam> Molecule name ([] to exit): maxi Sam> Database compressed. Sam> Making residue names. Sam> There are 131 residues, 1065 atoms. O > merg_atoms Sam> Merge from molecule name, and zone: mini 1 134 Sam> Merge to molecule name and start residue: maxi 1 Sam> Datablock containing transformation [<cr> identity]: Sam> 131 atoms Sam> 131 updated. O > mol maxi ca ; end The second step in reconstructing the protein is building the main chain using the C-alpha atoms as "anchor points" and scanning a database for fragments of residues with similar C-alpha atoms. This is done with the "Lego_auto_mc" command in O (watch the graphics screen when you execute this command !): O > lego_auto_mc maxi 1 131 Lego> MINI 1 131 MAXI O > obj zomaxi zo 1 131 end Oops - what are all these strange lines ? The answer is, that O doesn't yet know the coordinates of the side chain atoms. They are therefore kept at (1500,1500,1500) by default. The third step, then, is to automatically add the side chains in their most-common rotamer conformation. The O command that does this is "Lego_auto_sc": O > lego_auto_sc maxi 1 131 Lego> MAXI 1 131 ZOMAXI Lego> SCGLY is missing. ... Lego> Unable to draw the rotamers. O > obj zomaxi zo 1 131 end The "error" messages are due to glycines, so you don't have to worry about them. 
Do the final steps yourself:

* insert the missing tyrosine residues
* look for bad contacts between side chains and resolve them
* regularise the molecule
* compare the molecule with the original lipocalin; what is the RMSD of the C-alpha atoms ?

8.7 - One more question, Your Honour

(1) What are the parameters for the following commands: lego_auto_sc, move_zone, mutate_replace, sam_rename, merge_atoms, flip_peptide, refi_zone ?

Your notes:

9.0 - Fancy pictures

In this chapter you will learn how to produce high-quality pictures which convey particular types of information.

New O commands: Db_table_res Db_create Draw_object Select_inver Select_prope Select_visib Sketch_add Sketch_auto Sketch_CPK Sketch_objec Sketch_setup Sketch_stick Sketch_type Sketch_undo

Your notes:

9.1 - Sketch it !

You have (briefly) met one of the Sketch commands before (which one ?). In order to be able to play around with the various Sketch options more easily, do the following:

* make a C-alpha trace of P2 and centre on it
* click all other objects off (or delete them)
* execute the macro "omac/sketch_setup.omac"
* read the file "omac/sketch_menu.odb"
* type "@picasso"

What Sketch commands are there and what do they do:

* "Sketch_setup" - with this command you can set parameters for each of the different sketch types.
Supported types are tapeworm, ribbon, cylinder, arrow, spiral, rattler, stick, sphere and cartoon (the last three are special) * "Sketch_object" - this starts a new sketch object; you must supply its name * "Sketch_type" - define the sketch type to be used next; the type may be tapeworm, ribbon, cylinder, arrow, spiral or rattler * "Sketch_add" - type or pick a zone of residues which is to be added to the current sketch object in the type most recently selected with "Sketch_type" * "Sketch_undo" - undo the "Sketch_add" operations, beginning with the most recent one (in case you didn't like the result) * "Sketch_stick" - draws sticks around the bonds of a molecular object (e.g., a zone or C-alpha trace); if a datablock called ".STICK_RADIUS" (type Real) exists, the value in it is used as the radius for the sticks * "Sketch_CPK" - is similar to the command "CPK_object"; if a datablock ".CPK_RADII" exists, the values are taken to be the radii (in Å) for the chemical elements. Two examples of files containing such a datablock are "odat/radii.o" (gives small spheres) and "odat/cpk_radii.o" (van der Waals spheres) * "Sketch_auto" - automatically creates a macro called "@cartoon" in your database which creates a nice cartoon drawing of a zone of residues. Use "Sketch_setup" to change the details (or edit the macro afterwards) Chapter 3.4 of the O Manual contains more details about the niceties of the various sketch types. The easy and fun way to find out more about what you can do with these commands is to just play around with them. Change some parameters and see what the effect is. The sketch menu on your graphics screen makes such experimenting easy. 9.2 - On the table There are properties which are different for different amino-acid types, but identical for all amino acids of the same type. For example: Ala, His and Glu have different charges, but all Ala residues have the same charge (unless they are terminal residues). 
If you want to use such properties to colour your molecule, all you have to do is to make a table in a file which contains a list of all residue types and the value of that property for that residue type. For example, part of a charge table could like this: ALA 0 ARG 1 ASN 0 ASP -1 ... Such properties need not necessarily be numerical: ALA HPHOBE ARG CHARGE ASN POLAR ASP CHARGE ... The "omac" directory contains a number of such tables: O > ls omac/*tab*

omac/table_antigenicity.dat

omac/table_flexibility.dat

omac/table_generic.dat

omac/table_hydropathy.dat

omac/table_mutability.dat

omac/table_solv_access.dat

The file "omac/table_generic.dat" is a template table; copy it to your own directory and use it to make a table in which residue types are classified by their size (use the values BIG, MEDIUM, SMALL).
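A table file is just one residue type and one value per line, so it can also be written by a small script outside O. The following Python sketch is only an illustration and not part of O: the function name "format_residue_table" is hypothetical, the example values are the "nature" codes shown earlier in this section, and the six-character limit follows the advice in this tutorial to use no more than 6 characters per code.

```python
def format_residue_table(values, max_chars=6):
    """Return table text with one 'RESIDUE VALUE' pair per line.

    `values` maps residue type -> property value (hypothetical helper,
    not an O command). Values longer than `max_chars` are rejected.
    """
    lines = []
    for res_type, value in values.items():
        text = str(value)
        if len(text) > max_chars:
            raise ValueError(f"value {text!r} for {res_type} is too long")
        lines.append(f"{res_type.upper()} {text}")
    return "\n".join(lines) + "\n"


# Example values taken from the non-numerical table shown in this section.
nature = {"ALA": "HPHOBE", "ARG": "CHARGE", "ASN": "POLAR", "ASP": "CHARGE"}
table_text = format_residue_table(nature)
print(table_text, end="")
```

Write the resulting text to a file of your own and give that file name when "DB_table_res" asks for the residue table.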

You can use tables to create new residue-property datablocks. This is done with the "DB_table_res" command, as follows:

O > mol p2a

O > db_table

Heap> This file must contain a residue name and value.

Heap> The value must match the param block type.

Heap> e.g. in the following it is character*6

Heap> ALA Small

Heap> VAL Medium

Heap> PHE Big

Heap> Name of residue data block:P2A_residue_hydropathy

Heap> Type of elements (C,[I],R): r

Heap> File name of residue table: omac/table_hydropathy.dat

O > db_stat p2a_residue_hydropathy

Heap> Minimum and maximum values: -4.5000 4.5000

Heap> Mean and standard deviation: -0.5000 3.1544

O > pai_ramp res_hydropathy -4.5 4.5 ; ;

O > obj hydro ca ; end

Use your own table with residue-size classifications to colour P2 and draw a C-alpha trace.

9.3 - Selections

The "Lego_loop" command is one of several commands which use the atom-selection mechanism in O. You have already encountered the "Select_on" and "Select_off" commands.

* "Select_on" is used to select a zone of residues (without changing the selection state of the other residues), or the entire molecule by supplying two semi-colons (;;) as the parameters.

* "Select_off" removes a zone from the selection (or the entire molecule by using "Select_off ;;").

* "Select_property" selects atoms or residues depending on their values of a certain property (similar to "Paint_property").

* "Select_invert" inverts the selection: all selected atoms will be deselected and the other way around.

* "Select_visible" copies the selection of the current molecule to the visibility datablock of that molecule. Only bonds between atoms which both are selected are drawn.

The following example draws a C-alpha trace of P2 plus the side chains of those residues that have a bad pep-flip or a bad RSC fit value:

O > mol p2a

O > conn o.dat

Mol> Maximum inter-residue link distance = 6.00

Mol> There were 23 residues.

Mol> 113 atoms.

O > sel_off ;;

O > sel_prop atom_name = ca on

O > sel_prop res_pepflip > 2.5 on

O > sel_prop res_rsc > 1.5 on

O > pa_zone p2a ; yellow

O > pa_prop res_pepflip > 2.5 red

O > pa_prop res_rsc > 1.5 green

O > sel_vis

O > zo ; end

O > sel_on ;; sel_vis
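The logic of this example can be mimicked with ordinary sets. The Python sketch below is a toy model, not a description of how O stores selections: the three residues, their atom names and their pep-flip and RSC values are invented for illustration, and the helper "bond_is_drawn" is hypothetical. It shows that each "sel_prop ... on" adds atoms to the selection, and that a bond is visible only if both of its atoms are selected.

```python
# Toy molecule: three residues with five atoms each (invented data).
atoms = [(res, name) for res in (1, 2, 3) for name in ("N", "CA", "C", "O", "CB")]
residue_pepflip = {1: 0.4, 2: 3.1, 3: 0.9}   # invented values
residue_rsc = {1: 0.5, 2: 0.7, 3: 2.0}       # invented values

selected = set()                                                   # sel_off ;;
selected |= {a for a in atoms if a[1] == "CA"}                     # atom_name = ca on
selected |= {a for a in atoms if residue_pepflip[a[0]] > 2.5}      # res_pepflip > 2.5 on
selected |= {a for a in atoms if residue_rsc[a[0]] > 1.5}          # res_rsc > 1.5 on


def bond_is_drawn(atom_a, atom_b, selection):
    """A bond is drawn only if both of its atoms are selected."""
    return atom_a in selection and atom_b in selection


for res in (1, 2, 3):
    names = sorted(name for (r, name) in selected if r == res)
    print(res, names)
```

In this toy model residue 1 keeps only its C-alpha atom, whereas residues 2 and 3 keep all their atoms because one of their residue properties exceeds the cut-off.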

The following example does something similar. What exactly ?

O > mol p2a obj what conn all.dat

Mol> Maximum inter-residue link distance = 2.00

Mol> There were 23 residues.

Mol> 175 atoms.

O > sel_off ;;

O > sel_prop atom_name = ca on sel_prop atom_name = c on

O > sel_prop atom_name = o on sel_prop atom_name = n on

O > sel_prop atom_name = cb on sel_prop res_name = arg on

O > sel_prop res_name = his on sel_prop res_name = lys on

O > sel_prop res_name = asp on sel_prop res_name = glu on

O > sel_vis zo ; end

O > sel_on ;; sel_vis

Try the following exercise to practise your skills:

* make a table which for each residue type contains a code about the nature of the residue (e.g., HPHOBE, CHARGE or POLAR; use no more than 6 characters per code)

* use this table to create a residue property called "NATURE" for P2 myelin protein

* now create a new datablock of type Integer, called "P2A_RESIDUE_CAVITY"; set all values to zero initially (use the command "DB_create")

* inspect the structure of P2 again and for each residue which surrounds the ligand-binding cavity set the residue property "CAVITY" to one

* create an object which contains a C-alpha trace of P2 myelin protein, plus the side chains of all hydrophobic residues which surround the ligand-binding cavity

If there are things which you would like to add to your pictures, but which you can't draw with the set of O commands that you have learned so far, you can define them in another way. O can read object descriptions that contain a set of predefined keywords (collectively called ODL, Object Descriptor Language). The most useful of these keywords are:

* begin_object, followed by the name of the graphics object

* end_object

* colour, followed by the name or code of a colour

* line_type, followed by one of the following words: dashed, solid, dotted, dash_dot

* move, followed by the X, Y and Z coordinate of a point

* line, followed by the X, Y and Z coordinate of a point

* move_atom, followed by molecule name, residue name, atom name

* line_atom, followed by molecule name, residue name, atom name

* dot, followed by the X, Y and Z coordinate of a point

* text_colour, followed by a colour code or name

* text, followed by the X, Y and Z coordinate of a point and the text string that is to be drawn

* instance, followed by the name of an object and the name of an operator datablock

Type the following into a file; draw the object by executing the "Draw_object" command, followed by the name of the file.

begin_object test

text_colour slate_blue

text 50 65 33 We're almost at the end !!!

colour maroon

line_type solid

move 50 65 33

line 60 65 33

colour coral

dot 52 67 35

dot 54 69 37

colour orange_red

line_type dash_dot

move_atom p2a a1 ca

line_atom p2a a131 ca

colour slate_blue

move 55 75 25

line 65 75 25

move 55 75 25

line 55 75 35

text_colour medium_spring_green

text 66 75 25 X-axis

text 55 75 36 Z-axis

end_object
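Because an object description is plain text, it can also be generated by a script when typing the coordinates by hand becomes tedious. The Python sketch below is an assumption-laden illustration, not an O facility: the function "axes_odl" is hypothetical, and it uses only keywords from the list above (begin_object, colour, text_colour, move, line, text, end_object) to describe three labelled axes starting at a chosen origin.

```python
def axes_odl(name, origin, length,
             colour="slate_blue", text_colour="medium_spring_green"):
    """Return an ODL description of three labelled axes (hypothetical helper)."""
    x, y, z = origin
    tips = [
        ("X-axis", (x + length, y, z)),
        ("Y-axis", (x, y + length, z)),
        ("Z-axis", (x, y, z + length)),
    ]
    lines = [f"begin_object {name}", f"colour {colour}",
             f"text_colour {text_colour}"]
    for label, (tx, ty, tz) in tips:
        lines.append(f"move {x} {y} {z}")             # go back to the origin
        lines.append(f"line {tx} {ty} {tz}")          # draw the axis
        lines.append(f"text {tx} {ty} {tz} {label}")  # label the tip
    lines.append("end_object")
    return "\n".join(lines) + "\n"


odl_text = axes_odl("axes", (55, 75, 25), 10)
print(odl_text, end="")
```

Save the printed text in a file and draw it with "Draw_object", just like the hand-typed example.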

9.5 - Nice pictures

Some assorted hints about making good pictures (slides and plots):

* be creative with all the commands that you have learned (Paint, Sketch, Select etc.)

* if you make colour pictures, then use the colour to convey extra information. For example, if you use "Sketch_auto", the secondary structure elements will be obvious from the different representations. Therefore, colouring the residues according to secondary structure does not add any information to the picture. Instead, use "Paint_ramp" to colour the molecule from red at the N-terminal to blue at the C-terminal (this makes it much easier to follow the chain, especially if you don't use stereo)

* find a good view of your entire molecule which has as little overlap as possible and, preferably, shows (some of) the secondary structure motifs

* you can actually save such views, namely by writing out a datablock called ".GS_REAL" to a file. If you read it in again later you'll get the exact same view back (try this)

* when you make more detailed pictures, e.g. of interesting residues in an active site or ligand-binding cavity, use the Select commands to only show the interesting side chains

* always put all the steps that you took to generate a certain display in a macro (include "Read" commands to restore your views)

The final exercise: design some informative and aesthetic pictures showing, e.g., the overall structure of P2 myelin protein, or the ligand and the residues that interact with it, or other residues that you find interesting. If the result is really good, your assistant might be willing to explain how you can get a print-out of it.

Your notes:

10.0 - Appendices

10.1 - Index of O commands discussed in this tutorial

The following list contains all O commands which are discussed in this tutorial plus the sections in which they were used (for very common commands not all occurrences are listed). The first section listed is always the one in which the command was added to your battery of O knowledge. The number in {curly brackets} refers to the section in the O MANUAL in which the command is explained.

O COMMAND Section(s) {Section in the O Manual}

Angle_define 7.1 {3.7.9}

Backup_DB 1.2, 3.4, 10.6.21 {3.17.3}

Bell_ring 4.1 {3.18.4}

Ca_zone 1.5, 10.3.8, 10.3.9 {3.2.3}

Centre_atom 1.6, 7.5, 10.3.10 {3.17.1}

Centre_ID 1.6, 4.2, 10.3.10 {3.15.1}

Centre_next 7.5, 10.3.10 {3.18.8}

Centre_previ 7.5, 10.3.10 {3.18.9}

Centre_Xyz 1.5, 10.3.10 {3.17.2}

Centre_zone 1.6, 2.1, 5.2, 6.1, 8.1, 8.5, 8.6, 10.3.10 {3.18.6}

Clear_flags 3.2, 7.4 {3.15.7}

Clear_ID 2.1, 3.1 {3.15.9}

Connect_File 3.5, 9.3, 10.3.15 {3.2.9}

Copy_db 5.5, 10.4.2 {3.20.3}

Cover_Sphere 2.1, 10.6.4 {3.2.5}

CPK_object 6.3, 10.6.14 {3.17.10}

Db_create 9.3 {3.19.2}

Db_delete 5.4 {3.19.3}

Db_kill 5.4, 10.4.2 {3.20.4}

Db_Set_data 3.4, 5.5, 7.5, 10.4.9 {3.19.7}

Db_statistic 7.5, 7.6, 9.2 {3.20.5}

Db_table_res 9.2 {3.19.6}

Delete_objec 2.1, 3.2, 6.3 {3.2.8}

Dial_next 5.2, 8.1 {3.17.6}

Dial_previou 5.2 {3.17.7}

Directory 1.2, 1.3, 1.4, 3.3, 5.2, 5.4, 7.5, 10.3.20 {3.19.1}

Dist_define 7.1 {3.7.1}

Draw_object 9.4, 10.3.23, 10.3.24 {3.17.5}

End_object 1.5, 2.1 {3.2.7}

Flip_peptide 8.3 {3.6.4}

Hbonds_all 7.3, 8.3, 10.3.18, 10.6.22 {3.7.7}

Hbonds_mc 7.3, 10.6.22 {3.7.6}

If_yes_no 4.3 {3.18.5}

Lego_Auto_MC 8.6, 10.3.11 {3.11.4}

Lego_Auto_SC 8.6, 10.3.11 {3.11.6}

Lego_loop 8.2, 8.4, 9.3 {3.11.7}

Lego_setup 8.1 {3.11.1}

Lego_Side_Ch 8.1, 8.3, 8.5, 10.3.11 {3.11.5}

Lsq_explicit 5.3, 8.4 {3.8.1}

Lsq_improve 5.3, 5.4, 5.5, 10.6.23 {3.8.2}

Lsq_molecule 5.5 {3.8.4}

Lsq_object 5.3, 5.5 {3.8.3}

Lsq_Paired_a 5.4, 10.6.23 {3.8.5}

Map_Active_C 6.3 {3.9.5}

Map_Draw 6.3 {3.9.8}

Map_File 6.3 {3.9.1}

Map_Object 6.3 {3.9.2}

Map_Paramete 6.3, 10.3.2, 10.3.22 {3.9.3}

Menu_control 3.1, 3.2, 3.3, 4.6, 5.2, 7.1, 8.5, 10.6.7,

10.6.17, 10.6.25, 10.6.26 {3.16.1}

Merge_atoms 8.6, 10.3.11, 10.3.12 {3.14.8}

Message 4.1, 4.2, 4.6, 10.4.1, 10.4.4 {3.17.4}

Molecule_nam 1.5, 2.1, 4.1 {3.2.2}

Move_atom 8.5, 10.6.8 {3.6.1}

Move_fragmen 8.5 {3.6.3}

Move_object 5.2, 8.1 {3.6.5}

Move_zone 8.1, 8.5, 10.3.11, 10.3.12 {3.6.2}

Mutate_delet 8.2 {3.26.3}

Mutate_inser 8.4, 10.3.1, 10.3.11, 10.3.12 {3.26.4}

Mutate_repla 8.1, 10.3.1, 10.3.12 {3.26.2}

Mutate_setup 8.1 {3.26.1}

Neighbour_at 7.2, 10.6.18 {3.7.4}

Neighbour_re 7.2, 8.1, 10.6.18 {3.7.5}

No 4.3, 8.1, 8.2 {3.15.6}

Object_name 1.5, 2.1 {3.2.1}

On_off 1.5, 3.1, 3.2, 4.6, 7.1, 8.5, 10.4.4 {3.22.1}

O_setup 5.5, 10.3.19 {3.16.3}

Paint_case 2.3, 10.4.7 {3.3.3}

Paint_colour 2.3, 3.4, 5.3, 10.3.2, 10.3.24, 10.4.2, 10.6.9,

10.6.16 {3.3.9}

Paint_object 2.3, 5.3 {3.3.6}

Paint_obj_at 2.3 {3.3.8}

Paint_obj_zo 2.3 {3.3.7}

Paint_proper 2.3, 7.5, 9.3, 10.4.10, 10.4.12, 10.4.14 {3.3.1}

Paint_ramp 2.3, 5.1, 7.5, 9.2, 9.5, 10.4.13 {3.3.2}

Paint_select 10.4.5, 10.4.6 {3.3.5}

Paint_zone 2.3, 9.3 {3.3.4}

Pep_Flip 7.5, 8.3, 8.4 {3.28.1}

PhiPsi 7.4, 10.6.19 {3.7.8}

Print 4.1, 4.4, 10.4.1, 10.4.3, 10.4.9 {3.18.3}

Read_formatt 2.2, 2.3, 3.3, 4.5, 7.3, 9.5, 10.3.20, 10.4.3 {3.19.4}

Refi_setup 8.3 {3.13.1}

Refi_zone 8.3, 8.4, 8.5, 10.3.16, 10.6.24 {3.13.1}

Rot_Tran_Obj 6.3 {3.16.8}

RSC_fit 7.6, 8.3, 8.4, 10.3.13 {3.28.4}

Sam_atom_in 1.4, 5.1, 5.5, 6.1, 8.1, 8.6, 10.3.20 {3.14.1}

Sam_atom_out 5.5 {3.14.2}

Sam_init_db 8.6 {3.14.4}

Sam_list_seq 1.6, 5.1, 6.1 {3.14.3}

Sam_rename 8.4, 10.3.12, 10.3.17 {3.14.9}

Save_DB 1.2, 3.2, 3.4, 7.1, 10.6.12, 10.6.20 {3.15.8}

Screen_colou 10.3.2 {3.18.1}

Select_inver 9.3 {3.5.5}

Select_off 8.2, 8.4, 9.3, 10.4.5, 10.4.6 {3.5.2}

Select_on 8.2, 8.4, 9.3, 10.4.5, 10.4.6 {3.5.1}

Select_prope 9.3, 10.4.6 {3.5.3}

Select_visib 9.3 {3.5.4}

Sketch_auto 9.1, 9.5 {3.4.6}

Sketch_CPK 9.1, 10.4.8, 10.6.14 {3.4.8}

Sketch_objec 9.1 {3.4.2}

Sketch_setup 9.1, 10.4.15 {3.4.1}

Sketch_stick 9.1, 10.4.8 {3.4.7}

Sketch_type 9.1 {3.4.3}

Sketch_undo 9.1 {3.4.5}

Spawn 4.5 {3.16.5}

Sphere_centr 2.1, 3.4, 4.2, 6.1, 7.5, 10.6.4 {3.2.6}

Stop 1.2, 3.3 {3.15.4}

Symbols 4.4, 10.4.8, 10.6.13 {3.16.2}

Terminal_ID 4.2, 5.2 {3.15.2}

Tor_residue 8.5, 10.6.19 {3.6.6}

Trig_refresh 8.1 {3.7.3}

Trig_reset 7.1 {3.7.2}

Wait_ID 4.2 {3.15.3}

Write_format 2.2, 3.3, 3.4, 4.5, 5.2, 5.4, 7.6, 10.3.2, 10.3.24,

10.4.2, 10.4.3, 10.4.4, 10.4.9, 10.4.11 {3.19.5}

YASSPA 2.2, 2.3, 4.1, 4.3, 4.6, 5.1, 10.4.12 {3.28.3}

Yes 4.3, 5.2, 8.1, 8.2 {3.15.5}

Zone 2.1, 3.5 {3.2.4}

^ 1.5, 3.1 {1.7}

! 4.1 {1.6}

# 4.1, 4.4 {1.6}

$ (symbols) 4.4 {1.6}

$ (Unix) 4.4, 9.2 {3.16.6}

* 1.2 {3.19.1}

; 1.3, 2.2 {1.6}

@ 4.1, 4.2, 4.6, 9.1 {1.6}

10.2 - Inverted index

The following contains a number of keywords plus a list of O commands related to each keyword; if you find commands which are relevant to your problem, use appendix 10.1 and 10.5 to find out where they were discussed in this tutorial. An asterisk (e.g., in "mutate_*") means that several O commands, all beginning with "mutate_", are relevant.

ABORT command - no, clear_flags, *_reset

ALPHA helix - yasspa, sketch_*, paint_*, hbonds_*, phipsi

ALTER dihedral angles - tor_residue, lego_side_ch, lego_auto_sc

ALTER main chain - lego_loop, lego_auto_mc, tor_residue, flip_pep, refi_zone

ALTER position of a residue - move_fragment, move_zone

ALTER position of an atom - move_atom

ALTER sequence - mutate_*

ALTER side chain - move_fragment, tor_residue, lego_side_ch, lego_auto_sc,

refi_zone

ANGLES - angle_define, tor_residue, phipsi

BACKUP database - backup_db

BETA strand - yasspa, sketch_*, paint_*, hbonds_*, phipsi

BONDS - connect_file, select_visible, dist_define, hbonds_*, neighbour_*

CENTRE for map - map_active_centre, map_atom_centre, map_xyz_centre

CENTRE on a position - centre_*

COLOUR - paint_*

COMPARE molecules - move_object, yasspa, lsq_*

CONNECTIVITY - connect_file

DATABLOCKS - db_*, sam_*, read_formatted, write_formatted, directory

DEFAULT values - many commands, accept with Return key or ";"

DELETE residue - mutate_delete

DIALS - dial_next, dial_prev, dial_box, F6 key

DISPLAY object - end_object, ^

DISTANCES - neighbour_*, dist_define

DRAWING Ca trace - molecule_name, object_name, ca_zone, end_object

DRAWING cartoons - sketch_*, cpk_object

DRAWING map - map_*

DRAWING nearby residues - sphere_centre, cover_sphere

DRAWING residues - molecule_name, object_name, zone, end_object

DRAWING selected atoms - select_*, ca_zone, zone

END command - no, clear_flags, *_reset

ENDING O session - stop

HELIX - yasspa, sketch_*, paint_*, hbonds_*, phipsi

HYDROGEN bonds - hbonds_*, dist_define, neighbour_*

INSERT residue - mutate_insert

LABELS - clear_id, draw_object

LOG file - o_setup

MACRO - @, read_formatted, wait_id, #, !, if_yes_no, message, print,

bell_ring, o_setup

MAIN chain - connect_file, pep_flip, flip_pep, phipsi, lego_*,

tor_residue, move_fragment, move_zone

MEASURE - angle_define, dist_define, phipsi, tor_residue, neighbour_*

MOVE - move_*, dial_*, lsq_*, rot_tran_obj, flip_pep

MUTATE residue - mutate_*, lego_*, refi_*, merge_atoms

NAMES atoms - write_formatted, sam_atom_out

NAMES residues - sam_list_seq, write_formatted

OBJECT - object_name, sketch_object, draw_object, end_object,

delete_object, paint_object, move_object

OUTPUT - o_setup

PAINTING - paint_*

PEPTIDE - connect_file, pep_flip, flip_pep, phipsi

PROPERTIES - sam_atom_in, read_formatted, write_formatted, db_create,

db_set_data, db_table_res, db_statistics, select_*, paint_*

QUALITY proteins - phipsi, pep_flip, rsc_fit

QUITTING - stop

RMSD - lsq_explicit, lsq_improve

ROTATE - move_*, dial_*, lsq_*, rot_tran_obj

SALT links - hbonds_*, dist_define, neighbour_*

SAVE database - save_db, backup_db

SAVE molecule - write_formatted, sam_atom_out

SECONDARY structure - yasspa, sketch_*, paint_*, hbonds_*, phipsi

SELECT atoms - select_*

SELECT molecule - molecule_name, select_*

SEQUENCE - sam_list_seq

SIDE chain - lego_*, tor_residue, rsc_fit, move_fragment

SPHERE atoms - sphere_centre, cover_sphere

STOP - stop

STRAND - yasspa, sketch_*, paint_*, hbonds_*, phipsi

SURFACES - map_*, cpk_object

SYMBOLS - $, symbols, print, #

TRANSLATE - move_*, dial_*, lsq_*, rot_tran_obj

UNIX - $, spawn

VISIBILITY - select_*, ^, on_off

VISIBLE objects - on_off, ^

WRITE datablock - write_formatted

WRITE molecule - write_formatted, sam_atom_out

WRITE sequence - write_formatted

WRONG command - no, clear_flags, *_reset

10.3 - Frequently Asked Questions

This section contains some questions which are asked often by beginning O users, plus answers to these questions.

10.3.1 - Why does mutate_insert/replace distort my molecule ?

O maintains a list of pointers to atoms it should display. If you add or delete atoms, these lists are maimed, resulting in a strange distortion of your drawing(s). The remedy is to redraw your molecular object(s) after the mutate operation.

10.3.2 - Is there an easy way to select fancy colours in O ?

Yes, as a temporary feature in the SGI version of O. If you want to use fancy colours for your maps and molecules, for example when you are taking slides, try the "Screen_colour" command in O. On ESVs, this enables you to change the background colour of your O window. However, on an SGI, instead of changing the background colour, this command starts an SGI utility program called "cedit". It comes up in a small window; make this window a bit bigger and use the LEFT mouse button to change the Red Green and Blue components of your colour; the resulting colour is displayed on the right. When you have found a nice colour, read the three numbers at the bottom of the window, e.g. 222 21 108. Now go back to your O terminal window, and execute a command that requires you to select a colour (e.g., "Map_par" or "Paint_colour"). Now the big trick: when O asks for a colour you DO NOT type an exotic name such as brown_nose_beige, but instead you type the three numbers and ... VOILA !

Sometimes you can't use three RGB numbers, for example inside ODL objects. In those cases, use the following trick:

O > map_par ; ;

Map> Colour? [rgb]: 35 109 162 ; ;

O > wr .map_integer ;;

.MAP_INTEGER I 2 (10(x,i7))

3648767 1

Now, O has done the conversion between RGB and single-number O colour for you; the result is stored as the first item in the datablock .MAP_INTEGER, in this case 3648767.

Also check out "omac/colour_demo.odb" and "omac/colour_code.omac".

10.3.3 - Why do the dials move so fast on my SGI ?

You may change the sensitivity of the dials; a multiplicative scale constant is stored in the datablock ".DIAL_REAL". On the SGI dials, a value of 0.15 makes the response feel like the ESV dialbox. In addition, the on-screen dial box (press F6; SGI only) can be used. Its main advantage is that, when you have several options active which use the dials, you have access to ALL dials, e.g. you can rotate your molecule and a fragment alternately without having to do "Dial_next" and "Dial_prev" in between.

10.3.4 - What is the best way to backup a molecule ?

It is best to save your coordinates in ASCII format. You can of course use the PDB format, but if you use the O format, all the extra properties you may have created (YASSPA, colours, pep-flip etc.) will also be saved. In fact, it is a good idea to make a macro to do this, and to put it on the menu.

10.3.5 - Is there a split-screen stereo, and if so, how do I access it ?

Currently, only the SGI version supports stereo. Function key F9 toggles it on and off.

10.3.6 - When I try to display a map, I get an error condition #43.

Error message #43 means: "Extent of box not within map limits". You get this error message if you try to contour the map in a volume which is not covered by the map in the file. For example, if the map covers the A molecule, and you are trying to display contours in the B molecule.

10.3.7 - How do I reset the parameters for major menu X ?

Most major menus have their constants stored in datablocks whose names start with a period and the major menu name (e.g. ".MAP_REAL"). If you delete such a datablock, it is immediately recreated, with default values. You can also read in "startup.o" and "menu.o" again. The most drastic way to reset things is to write out your molecules, and start from scratch with an empty database.

10.3.8 - When I display my C-alpha trace, some of the bonds are missing.

C-alpha-C-alpha bonds are drawn according to a maximum distance criterion. This distance is stored in the datablock _molecule_ca_mxdist. The default value is 4.2 Å. Probably your structure is not yet perfect, and you have some C-alpha-C-alpha distances that are longer than they are supposed to be (normal value is ~3.8 Å).
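The distance criterion is easy to reproduce outside O if you want to locate the offending residues. The Python sketch below is a minimal illustration under stated assumptions: the function "find_ca_breaks" is hypothetical, the coordinates are invented, and only the 4.2 Å default quoted above is taken from this tutorial.

```python
import math


def find_ca_breaks(ca_coords, max_dist=4.2):
    """Return (i, i+1, distance) for consecutive C-alphas further apart than max_dist."""
    breaks = []
    for i in range(len(ca_coords) - 1):
        dist = math.dist(ca_coords[i], ca_coords[i + 1])
        if dist > max_dist:
            breaks.append((i, i + 1, dist))
    return breaks


# Invented C-alpha coordinates (in Angstrom); the middle step is too long.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 5.0, 0.0), (7.6, 5.0, 0.0)]
breaks = find_ca_breaks(trace)
for first, second, dist in breaks:
    print(f"no bond between C-alpha {first} and {second}: {dist:.2f} A")
```

Any pair reported by such a check is a place where the model needs attention (or where the maximum distance would have to be increased).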

10.3.9 - Can I display an RNA backbone ?

Yes. Just as for proteins, you can use the "Ca_zone" command. "CA" actually stands for "Central Atom". The name of this atom defaults to C-alpha, and is stored in "_molecule_ca". Change this to "P" or "C1*", or whatever you prefer. You also have to change the "_molecule_ca_mxdist" datablock to approximately 7.5 Å. There is a datablock called "_molecule_type", which defaults to "PROTEIN". It doesn't do anything (yet ?), but you can change it to "RNA" if you like.

10.3.10 - How can I centre on a particular spot in space ?

There are various possibilities:

* centre_id (then click on an atom)

* centre_zone molecule first_residue last_residue

* centre_atom molecule residue_name atom_name

* centre_xyz x y z

* centre_next property operator value

* centre_prev property operator value

Examples of using the "Centre_next" and "Centre_previous" commands:

* centre_next atom_name = ca (move to next residue)

* centre_next atom_name = o1 (move to next water)

* centre_next residue_rsfit < 0.6

* centre_prev residue_pepflip > 2.5

10.3.11 - How do I Mutate_insert after the last residue ?

(1) first you do Mutate_insert ...; say you add a Phe in this fashion

(2) then you look for a Phe which is nearby

(3) do Move_zone and double-click the C-alpha of that Phe

(4) use the dials to move this Phe into the position where your new C-terminal Phe should go; when it's in place DO NOT HIT YES !!!

(5) instead, use Merge_atoms to copy the CURRENT coordinates of the MOVED Phe to those of the NEW Phe

(6) after that, hit NO immediately, so as to cancel the Move_zone

(7) draw a new object ("zone ; end") and voila !

You can use the same mechanism to add entire helices or strands at the C-terminus. You could probably also use the Baton command to position the C-alpha atoms, then do Lego_auto_mc and Lego_auto_sc and, finally, Lego_side_ch to get the correct rotamers.

10.3.12 - How do I Mutate_insert before the first residue ?

You can do it by duplicating the "first" residue after itself using Mutate_insert and Merge_atoms. Then Mutate_replace the first residue to whatever residue it is going to be, use Sam_rename to correct the residue names, and Move_zone on the new residue:

Lys2-Gly3-Glu4---

|

Mut_insert mol 2 2a lys (makes empty entry)

Merge_atoms mol 2 2 mol 2a (copy coordinates)

|

Lys2-Lys2a-Gly3-Glu4---

|

Mutate_replace mol 2 ala

|

Ala2-Lys2a-Gly3-Glu4---

|

Sam_rename mol ; 1

Move_zone mol 1 1 (Ala1 will initially be on top of Lys2)

|

Ala1-Lys2-Gly3-Glu4---

It's safe to do a "Write_formatted mol_* save_file ;" before you start, just in case you screw up ...

10.3.13 - Why does RSC give crazy values for some residue types ?

Most often this is due to wrong atom names. Two well-known examples are the X-PLOR names "ILE CD" and the C-terminal "OT1/OT2". The quickest remedy:

* change ILE CD to CD1

* change OT1 to just O and remove OT2 (temporarily)
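If many files need this treatment, the two renaming rules can be applied by a script before the coordinates are read into O. The Python sketch below only illustrates the rules listed above: the function "fix_xplor_names" and the representation of atoms as (residue type, atom name) pairs are assumptions, and reading or writing actual coordinate files is left out.

```python
def fix_xplor_names(atoms):
    """Apply the two renaming rules to (residue_type, atom_name) pairs."""
    fixed = []
    for res_type, atom_name in atoms:
        if res_type == "ILE" and atom_name == "CD":
            atom_name = "CD1"   # ILE CD -> CD1
        elif atom_name == "OT1":
            atom_name = "O"     # C-terminal OT1 -> O
        elif atom_name == "OT2":
            continue            # drop OT2 (temporarily)
        fixed.append((res_type, atom_name))
    return fixed


example = [("ILE", "CG1"), ("ILE", "CD"), ("LYS", "CD"), ("GLU", "OT1"), ("GLU", "OT2")]
fixed = fix_xplor_names(example)
print(fixed)
```

Note that only the CD atom of isoleucine is renamed; the CD atoms of other residue types are left alone.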

10.3.14 - Why does O ignore some of the commands in my macro ?

Probably because one or more of your input lines are too long. At present (931027) there is a limit of 72 characters on the length of input, BOTH from the keyboard AND from macros !
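A quick check outside O can tell you which lines of a macro are too long. The Python sketch below is an illustration only: the function "find_long_lines" is hypothetical, and the limit of 72 characters is the one quoted above.

```python
MAX_INPUT_LENGTH = 72  # limit quoted in this section


def find_long_lines(macro_text, limit=MAX_INPUT_LENGTH):
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(macro_text.splitlines(), start=1)
        if len(line) > limit
    ]


# Invented macro text; the second line is deliberately too long.
macro = "mol p2a\n" + "print " + "x" * 80 + "\nca ; end\n"
long_lines = find_long_lines(macro)
for number, length in long_lines:
    print(f"line {number} has {length} characters")
```

Split any line that is reported into several shorter commands before running the macro again.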

10.3.15 - Why doesn't O draw all bonds in my ligand ?

At present, there is an undocumented requirement on the contents of CONNECT records, namely that one of them MUST contain the (dummy) links "-" and "+" ! For example, to draw pentafluorobenzylalcohol correctly, i.e. without a bond between hydrogens H1 and H2, you must use the following connectivity information:

PFB

ATOM O1 C7 C1 C2 C3 C4 C5 C6 F2 F3 F4 F5 F6 H1 H2 H3

CONNECT - O1 C7 C1 C2 C3 C4 C5 C6 C1 +

CONNECT C2 F2

CONNECT C3 F3

CONNECT C4 F4

CONNECT C5 F5

CONNECT C6 F6

CONNECT C7 H1

CONNECT C7 H2

CONNECT O1 H3

10.3.16 - Does O work with nucleic acids ?

Most options work just as well with proteins as with any other type of molecule.

* Refi_zone - needs a separate dictionary

* Rsr_* works on anything that you specify as having fragments (rsr_zone) or as a rigid body

* Lego_* commands don't map well to nucleic acids; they are too floppy for the main chain database

10.3.17 - How can I give my waters/ligands a different chain-id ?

With the Sam_rename command. Suppose that residues a1 - a206 are your protein, and a207 - a226 your waters, then do: "sam_rename mol_name a207 a226 w301". Your waters will now be called w301 - w320.
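The arithmetic behind the new names can be sketched as follows. This Python fragment only illustrates the numbering in the example above; it does not describe how O implements "Sam_rename", and the function "renamed_zone" is hypothetical.

```python
import re


def split_name(name):
    """Split a residue name such as 'a207' into ('a', 207)."""
    match = re.fullmatch(r"([A-Za-z]*)(\d+)", name)
    if match is None:
        raise ValueError(f"cannot parse residue name {name!r}")
    return match.group(1), int(match.group(2))


def renamed_zone(first, last, new_first):
    """Map every residue name in the zone first..last to its new name."""
    prefix, start = split_name(first)
    last_prefix, end = split_name(last)
    new_prefix, new_start = split_name(new_first)
    if prefix != last_prefix or end < start:
        raise ValueError("zone must run forwards within one chain")
    return {
        f"{prefix}{start + i}": f"{new_prefix}{new_start + i}"
        for i in range(end - start + 1)
    }


names = renamed_zone("a207", "a226", "w301")
print(names["a207"], "...", names["a226"])
```

For the example in the text, the 20 waters a207 - a226 end up as w301 - w320.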

10.3.18 - How can I get H-bonds between protein and ligand ?

First, make an appropriate entry for your specific ligand in "residue.dict" to define hydrogen bond donors and acceptors. Then, merge your protein and your ligand into one molecule and use the "Hbonds_all" command in O.

10.3.19 - Can I get a LOG file from O ?

Yes; use the "O_setup" command.

10.3.20 - What does this INST error mean ?

Usually INST errors occur when your database is full. Check with "dir *" and remove some datablocks if necessary. Note that after you clean up the database, the reported space (real, integer or whatever) is not updated immediately. This is a feature; as soon as you Read or Sam_at_in something, the database statistics will be updated.

10.3.21 - Can I ID the atom at the active centre from a macro ?

The question is: "Is there a way to ID the atom at the active centre when using Center_next or Center_prev from a macro ? After using this command a number of times, zipping through a model, it is nice to know where I am."

The following lines inside your macro should do the trick:

print You are now in molecule $.id_m

print Residue $.id_r atom $.id_a

10.3.22 - How should I contour cavities and surfaces ?

On SGIs, use semi-transparent surfaces ("Map_par" line type 5). Contour at a level close to 0 (e.g., 0.01) to get smooth images.

10.3.23 - How can I connect the two S-gamma atoms in a disulfide ?

At present (931209), O doesn't know about disulfides. The following solution is due to Laurent Maveyraud:

If it is for taking pictures, you have the possibility to draw a stick using ODL commands. Create a file "disulfide.odl" which contains:

begin_object disu ! or whatever you want

colour green ! if your sulphurs are green

stick molname resnum1 SG molname resnum2 SG

end_object

and draw this from O with "draw_object disulfide.odl". A more complicated solution is to make a new residue of just the two sulphurs and define the connectivity for them.

10.3.24 - Why doesn't sphere_atom work in my ODL file ?

The following ODL file illustrates how you can make your own objects with spheres (draw with "Draw_object"):

begin_object q

light .3 .3 .3

surf_mode 4 4

surf_prop .2 .7 .8 90. .0

sphere_atom a a132 c1 3. 18010

end_object

You need some extra instructions to generate the lighting parameters, and the colour is a packed integer. Get this value by setting paint_colour to what you want and then write out the contents of .active_colour:

O > paint_colour wino_nose_red

O > write .active_colour ;;

10.4 - Macros

This section contains a number of O macros. You can type them in and use them, or you can use them as templates when writing your own macros, or you can just study them and try to figure out how they work (and what they do).

10.4.1 - date.omac

! date.omac

!

! show current time as an O message

$ echo message `date` > qq1

$ echo print `date` >> qq1

@qq1

$ /bin/rm qq1

10.4.2 - colour_code.omac

! colour_code.omac

!

! ask the user to type a colour name

! convert to an integer and show the result

!

copy_db backup_colour .active_colour

paint_colour # Enter the name of a colour > #

write .active_colour qq1 ;

$ echo print Your colour is `tail -1 qq1` > qq2

@qq2

$ /bin/rm qq1 qq2

copy_db .active_colour backup_colour

db_kill backup_colour

10.4.3 - edb.omac

! edb.omac

!

! edit an O datablock

write #Which datablock to edit ? # temp.db ;

$ xedit temp.db

read temp.db

$ /bin/rm temp.db

print ... all done

10.4.4 - all_on_off.omac

! macro to toggle ALL objects ON/OFF

!

on_off message Wait

write .menu qq1 ;

$ grep '\^' qq1 > qq2

@qq2

$ \rm qq1 qq2

message Done

! colour CAs with bad pepflips

!

mol #Which molecule ?#

obj flip

sel_on ;;

pai_sel yellow

sel_off ;;

sel_prop residue_pepflip > #Cut-off (A) ?# on

pai_sel red sel_on ;;

ca ; end

10.4.6 - acid_base.omac

! acidic, basic and other residues in different colours

!

mol #Which molecule ?#

obj acid

sel_on ;;

pai_sel yellow

sel_off ;;

sel_prop residue_type = asp on

sel_prop residue_type = glu on

pai_sel red

sel_off ;;

sel_prop residue_type = lys on

sel_prop residue_type = arg on

sel_prop residue_type = his on

pai_sel blue

sel_on ;;

ca ; end

10.4.7 - cnos_colours.omac

! paints C, N, O and S in their default colours

!

pa_case atom_z 4 6 7 8 16 yellow blue red green

10.4.8 - ball_and_stick.omac

! make a ball-and-stick model of an object

!

symbol my_obj # Which object ?#

ske_stick $my_obj

ske_cpk $my_obj

on_off

message done

10.4.9 - set_prefs.omac

! macro to set several datablock entries interactively

!

db_set_dat .molec_obj_integer 9 9 #How many IDs per object ?#

db_set_dat .molec_obj_integer 8 8 #What colour for IDs ?#

!

print 'Colour of moving atoms :'

wr .mnp_integer ;;

db_set_dat .mnp_integer 1 1 #Colour of moving atoms ?#

!

print 'Cut-off for neighbour_atom and neighbour_res :'

wr .trig_real ;;

db_set_dat .trig_real 1 1 #Cut-off (A) ?#

!

print 'LSQ cutoff (10*A), min nr residues, colour :'

wr .lsq_integer ;;

db_set_dat .lsq_integer 1 1 #Cut-off (10*A; INTEGER) ?#

db_set_dat .lsq_integer 2 2 #Min nr of residues ?#

db_set_dat .lsq_integer 3 3 #Colour for lsq_paired ?#

!

print 'Done !'

10.4.10 - paint_restype.omac

! paint a particular residue type in one colour

!

pa_prop res_type = #Which residue type ?# #Which colour ?#

10.4.11 - save_view.omac

! write current view matrix to a file

!

write .gs_real #File name ?# ;

10.4.12 - yasspa.omac

! do YASSPA on a molecule and produce colour-coded CA-trace

!

mol #Which molecule ?#

!

yasspa ; alpha 0.5

yasspa ; beta 0.8

!

obj yasspa

pa_pro res_2ry = " " yellow

pa_pro res_2ry = alpha red

pa_pro res_2ry = beta green

ca ; end

on_off

10.4.13 - rainbow.omac

! draw a CA-trace coloured red -> blue from N -> C-terminus

!

mol #Which molecule ?#

obj ramp

pa_ra ; ; ; ;

ca ; end

on_off

10.4.14 - nice_residue_colours.omac

! paint residue types (from a posting on Usenet by

! Jeremy John Ahouse, Brandeis University,

! ahouse@hydra.rose.brandeis.edu)

!

mol #Which molecule ?#

!

! Red for acidic amino acids; Glu, Asp

! (since red is a common danger signal and acids are dangerous

! (well maybe not amino acids, but it's a start))

pa_prop res_type = glu red

pa_prop res_type = asp red

!

! Blue for basic amino acids; Lys, Arg, His

pa_prop res_type = lys blue

pa_prop res_type = arg blue

pa_prop res_type = his blue

!

! White for hydroxyl amino acids; Ser, Thr (as in whitewater)

! (this was not possible so I chose a cool "whitewater" color - JJA)

pa_prop res_type = ser white

pa_prop res_type = thr white

!

! Green for amide amino acids; Asn and Gln

! (since glutamine and asparagine rhyme with green)

pa_prop res_type = asn green

pa_prop res_type = gln green

!

! Yellow for sulphur amino acids; Cys, Met

! (this one's obvious)

pa_prop res_type = cys yellow

pa_prop res_type = met yellow

!

! Black for hydrophobic amino acids; Ala, Val, Leu, Ile

! (Black is the opposite of white and so if white is for hydrophilic

! hydroxyl amino acids black is a natural for hydrophobic ones)

pa_prop res_type = ala brown

pa_prop res_type = val brown

pa_prop res_type = leu brown

pa_prop res_type = ile brown

!

! Orange for aromatic amino acids; Tyr, Phe, Trp

! (since "orange" sounds a little like "aromatic" and

! oranges are aromatic (if that suits you better))

pa_prop res_type = tyr orange

pa_prop res_type = phe orange

pa_prop res_type = trp orange

!

! Purple for proline; Pro

! (since both have "prl" in them)

pa_prop res_type = pro magenta

!

! Grey for glycine; Gly

! (since both start with "g" and grey is sort of blah-like glycine)

pa_prop res_type = gly grey

!

obj nice

ca ; end

10.4.15 - sketch_setup.omac

! set up sketch types

!

! tapeworm, type, width

ske_set tapeworm solid 1.0

!

! ribbon, width, lines, segments, smoothness

ske_set ribbon 1.0 7 5 2

!

ske_set cylinder solid 2.8 16

!

! arrow, type, ...

ske_set arrow solid 2.5 0.5 3.8 2 0 3 2

!

! spiral, type, ...

ske_set spiral solid 3.0 1.0 0.5 2 0 3 2

!

! rattler, type, radius, edges, smoothness

ske_set rattler solid 0.5 5 2

!

ske_set stick solid .1 5

ske_set sphere smooth 2

ske_set cart spiral magenta alpha arrow cyan beta rattler yellow " "

!

sketch_setup ?

10.5 - Other O commands

This section contains an ultra-brief description of O commands that are not discussed in this tutorial. The number of the section of the O Manual in which the command is discussed is shown in {curly brackets}. Commands that you might like to experiment with are printed in bold typeface.

O COMMAND Description {Section in the O Manual}

Arith_db arithmetical combination of datablocks {3.20.2}

Baton used in crystallographic model building {3.31.1}

Bone_break_b \

Bone_draw |

Bone_make_bo |

Bone_mask | all Bone_* commands are used in crystallographic

Bone_pick_Ca | model-building {chapter 3.10}

Bone_redefin |

Bone_repeat |

Bone_setup |

Bone_skip /

Db_set_atom set values in a datablock for one atom {3.19.9}

Db_set_zone set values in a datablock for a zone {3.19.8}

Dial_box change the assignments of the dials {3.17.8}

EZD_draw \ EZD ("easy density") files are maps in a simple

EZD_map / format which can be read and drawn by O {3.28.5, 3.28.6}

Help O help facility {3.16.4}

Lego_bones used in crystallographic model building {3.11.3}

Lego_Ca used mainly in crystallographic model building {3.11.2}

Map_atom_cen define an atom as the map centre {3.9.4}

Map_cache reset map-cache memory {3.9.9}

Map_cover draw a map around an object {3.9.10}

Map_informat print information about the current map file {3.9.7}

Map_xyz_cent define a point as the map centre {3.9.6}

MR_rt \ the MR_* commands are used in Molecular Replacement

MR_setup / applications {chapter 3.30}

Oda_calc \

Oda_chop |

Oda_close |

Oda_conn | the Oda_* commands are not yet used {-}

Oda_file |

Oda_setup |

Oda_talk |

Oda_zone /

Patt_cross \

Patt_move | the Patt_* commands are used for solving Pattersons

Patt_setup / {chapter 3.25}

Plot obsolete {3.24.2}

Plot_2D should not be used yet {-}

Plot_off close the current plot file {3.24.4}

Plot_on open a new plot file and start capturing objects {3.24.3}

Plot_setup set some plot parameters {3.24.1}

Rotate_pictu rotate the picture around an axis {3.17.9}

Rot_trans_db apply an operator to a point stored in a datablock {3.20.1}

RSR_contours \

RSR_dgnl |

RSR_dihedral |

RSR_map | the RSR_* commands are used in crystallographic

RSR_rigid | model building {chapter 3.12}

RSR_rotamer |

RSR_scale |

RSR_setup |

RSR_zone /

RS_fit calculate fit of observed and calculated density {3.28.2}

Seq1 ? {-}

Slider_combi \

Slider_displ |

Slider_guess | the Slider_* commands are used in crystallographic

Slider_lego | model building {chapter 3.27}

Slider_setup |

Slider_show /

Spin spin the picture around and time this operation {3.18.2}

Stereo not implemented on SGIs (use F9 or F1) {3.16.7}

Symm_cell \

Symm_object | the Symm_* commands are used in crystallographic

Symm_setup | model building {chapter 3.23}

Symm_sphere /

Template ? {3.18.7}

Trace_fill_C \

Trace_prune | the Trace_* commands are used in crystallographic

Trace_setup | model building {chapter 3.31}

Trace_sphere /

Water_init | the Water_* commands are used in crystallographic

Water_pekpik / model building {3.14.5, 3.14.6, 3.14.7}

10.6 - Selected datablocks

This section contains a brief description of some datablocks which you may like to add or alter (or may just be curious about).

10.6.1 - .message_template

Contains the "design" of the information message that is shown at the top of the graphics screen whenever an atom is picked. See section 3.4 of this tutorial and 3.2.11 of the O Manual.

.MESSAGE_TEMPLATE T 11 40

%MOLNAM %RESNAM %Restyp %ATMNAM

|XYZ

atom_xyz

|B

atom_b

|Flip

residue_pepflip

|RSC

residue_rsc

|

residue_2ry_struc

10.6.2 - .id_template

Contains the design of the atom labels that are shown as part of the graphics object whenever an atom is picked. See section 3.4 of this tutorial and 3.2.11 of the O Manual.

.ID_TEMPLATE T 2 40

%Restyp %RESNAM %ATMNAM

residue_2ry_struc

10.6.3 - .molec_obj_integer

Item 8 is the integer colour code for atom labels. Item 9 is the maximum number of labels per object. See sections 3.4 and 10.4.9 of this tutorial and 3.1.13 of the O Manual.

.MOLEC_OBJ_INTEGER I 110 (10(x,i7))

0 17 0 0 0 3 7 3316172 10 0

25 4 2 1 1 4 2 4 3 3

3 8 5 2 1 3 1 4 4 4

...

10.6.4 - .molec_obj_real

Contains the cut-off (in Å) for drawing sphere objects. See section 3.4 of this tutorial.

.MOLEC_OBJ_REAL R 1 (10(x,f7.5))

8.00000

10.6.5 - .o-version

Contains the name of the O version which was used to create or update your current database.

.O-VERSION T 1 20

O version 5.9.2

10.6.6 - .active_centre

Contains the coordinates of the current screen centre. This is used by many commands (e.g., "Sphere_centre"), but you may manipulate it yourself as well. See section 3.2.13 of the O Manual.

.ACTIVE_CENTRE R 3 (10(x,f7.4))

60.1340 64.9440 19.6980

10.6.7 - .menu

Contains the items occurring on the menu of the graphics screen.

Centre_ID

Clear_ID

Clear_flags

...

^what

^graph

^hbonds

10.6.8 - .moving_atom

Contains a description (using ODL; see section 9.4 of this tutorial and chapter 7 of the O Manual) of how a moving atom should be drawn. See section 3.6.7 of the O Manual.

.MOVING_ATOM T 9 20

begin moving_atom

colour magenta

M 0.2 0 0

L -0.2 0 0

M 0 0.2 0

L 0 -0.2 0

M 0 0 0.2

L 0 0 -0.2

end_object

10.6.9 - .colour_names

Contains the (English) names of the predefined colours.

.COLOUR_NAMES T 71 20

aquamarine

black

blue

...

white

yellow

yellow_green

10.6.10 - .error_messages

Contains some O error messages. You could translate these into Swahili if your English is a bit rusty.

.ERROR_MESSAGES T 80 50

File type not supported

File does not exist

...

Programmer error. Email bug-report to alwyn@xray.b

10.6.11 - .dial_real

Contains a scale factor which controls the sensitivity of the physical dials. Experiment with values in the range ~0.1 - 1.

.DIAL_REAL R 1 (10(x,f7.5))

0.15000

10.6.12 - .timestamp

Contains the date and time when you last saved your database !

.TIMESTAMP T 1 24

Thu Dec 30 17:13:29 1993

10.6.13 - .symbols

Contains a list of currently defined symbols and their values.

.SYMBOLS T 20 72

MYMOL P2A

.ID_R A67

.ID_A CA

USER Gerard

.ID_M P2A

...

Contains the radii for the various elements used by "CPK_object" and "Sketch_CPK".

1.10000 1.50000 1.50000 1.50000 1.50000 1.70000 1.60000 1.50000 1.50000

...

10.6.15 - .gs_real

Contains the current view matrix. If you save it in a file and read it back in later, you'll get the same view back.

.GS_REAL R 27 (8(x,f8.4))

0.1951 60.1340 64.9440 19.6980 0.1031 0.1365 0.1196 0.0000

-0.0986 0.1577 -0.0949 0.0000 -0.1524 -0.0096 0.1423 0.0000

5.7728 -16.3076 -5.0592 1.0000 0.0920 0.9820 0.0000 0.9000

0.0000 0.0000 0.0000
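
The numeric listings in this section share one layout: a header line with the datablock name, a type letter (R or I), the number of items and a Fortran format, followed by the values. As a rough illustration, the Python sketch below reads such a listing. The function name parse_datablock is made up for this example, and it assumes the values are separated by blanks, which holds for the listing above but is not guaranteed by every format (the numbers in .refi_dict, for instance, run together).

```python
def parse_datablock(text):
    """Parse a numeric (type R or I) datablock given as plain text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Header: name, type letter, item count, Fortran format
    name, kind, count, fmt = lines[0].split(None, 3)
    kind = kind.upper()
    if kind not in ("R", "I"):
        raise ValueError("only R and I datablocks are handled in this sketch")
    convert = float if kind == "R" else int
    values = [convert(tok) for ln in lines[1:] for tok in ln.split()]
    if len(values) != int(count):
        raise ValueError("expected %s values, found %d" % (count, len(values)))
    return name, kind, values


GS_REAL = """\
.GS_REAL R 27 (8(x,f8.4))
  0.1951 60.1340 64.9440 19.6980 0.1031 0.1365 0.1196 0.0000
 -0.0986 0.1577 -0.0949 0.0000 -0.1524 -0.0096 0.1423 0.0000
  5.7728 -16.3076 -5.0592 1.0000 0.0920 0.9820 0.0000 0.9000
  0.0000 0.0000 0.0000
"""

name, kind, values = parse_datablock(GS_REAL)
print(name, kind, len(values))
print(values[1:4])  # same numbers as the .active_centre listing in 10.6.6
```

The count check catches the most likely accident, namely a value lost or split when the file was edited by hand.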

10.6.16 - .active_colour

Contains the integer code of the currently selected colour.

.ACTIVE_COLOUR I 1 (8(x,i8))

16711680
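
The value 16711680 equals 255 x 65536, which suggests (this tutorial does not say so explicitly) that the integer code packs the red, green and blue components as 65536*red + 256*green + blue, each in the range 0-255. The Python sketch below decodes and encodes colour codes under that assumption; both function names are invented for the example.

```python
def decode_colour(code):
    """Split an integer colour code into (red, green, blue), each 0-255.

    Assumes code = 65536*red + 256*green + blue (see text).
    """
    if not 0 <= code <= 0xFFFFFF:
        raise ValueError("colour code outside the 24-bit range")
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF


def encode_colour(red, green, blue):
    """Inverse of decode_colour under the same assumption."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("components must be in the range 0-255")
    return (red << 16) | (green << 8) | blue


print(decode_colour(16711680))  # -> (255, 0, 0)
```

If the assumption holds, the code listed above is pure red.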

If you write this with format "(1x,2a)", you'll get a list of all O commands.

Centre_ID

Terminal_ID

Wait_ID

Stop

Yes

No

Clear_flags

...

10.6.18 - .trig_real

Contains the cut-off distance for considering atoms as being neighbours. See section 7.2 of this tutorial and section 3.7.11 of the O Manual.

.TRIG_REAL R 1 (10(x,f7.5))

3.50000

10.6.19 - .torsion_information

This datablock contains information about the dihedrals that can be changed with "Tor_residue". See section 3.6.6 of the O Manual for more information.

.TORSION_INFORMATION T 100 72

RESIDUE ALA

TORSION PHI -57. C- N CA C CB C O N+

TORSION PSI -47 N CA C N+ O N+

RESIDUE ARG

TORSION PHI -57. C- N CA C CB C O CG CD NE CZ NH1 NH2 N+

TORSION PSI -47 N CA C N+ O N+

TORSION CHI1 -60. N CA CB CG CG CD NE CZ NH1 NH2

TORSION CHI2 180. CA CB CG CD CD NE CZ NH1 NH2

TORSION CHI3 180. CB CG CD NE NE CZ NH1 NH2

TORSION CHI4 180. CG CD NE CZ CZ NH1 NH2

...
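
Each TORSION line in the listing above appears to hold a name, a default angle, the four atoms that define the dihedral, and then the atoms that move when it is changed. That reading is an assumption drawn from the listing itself (see section 3.6.6 of the O Manual for the real story). The Python sketch below splits the lines accordingly; the Torsion tuple and the function name are made up for the example.

```python
from typing import List, NamedTuple, Optional


class Torsion(NamedTuple):
    residue: Optional[str]
    name: str
    default: float
    defining_atoms: List[str]  # the four atoms of the dihedral (assumed)
    moving_atoms: List[str]    # atoms moved by the rotation (assumed)


def parse_torsions(text):
    """Collect TORSION entries, remembering the current RESIDUE."""
    torsions = []
    residue = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "RESIDUE":
            residue = fields[1]
        elif fields[0] == "TORSION":
            torsions.append(Torsion(residue, fields[1], float(fields[2]),
                                    fields[3:7], fields[7:]))
    return torsions


SAMPLE = """\
RESIDUE ALA
TORSION PHI -57. C- N CA C CB C O N+
TORSION PSI -47 N CA C N+ O N+
"""

for t in parse_torsions(SAMPLE):
    print(t.residue, t.name, t.default, t.defining_atoms, t.moving_atoms)
```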

10.6.20 - file_o_save

Contains the name of your database file.

FILE_O_SAVE T 1 72

p2.o

10.6.21 - file_o_backup

Contains the name of your database backup file.

FILE_O_BACKUP T 1 72

p2_backup.o

10.6.22 - .solid_hbonds

By default, hydrogen bonds are drawn as dotted lines. They may be drawn as a series of small, solid spheres instead, if you include this datablock. See section 3.7.11 of the O Manual.

.solid_hbonds t 1 10

0.2 7

10.6.23 - .lsq_integer

This datablock is described in section 5.4 of this tutorial (together with other datablocks used by the Lsq commands) and section 3.8.8 of the O Manual.

.LSQ_INTEGER I 3 (8(x,i8))

38 3 16711680
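
Going by the prompts in the set_prefs macro (section 10.4.9), the three items are the cut-off in units of 0.1 Å, the minimum number of residues, and a colour code. The Python sketch below merely labels the items and converts the cut-off to and from Å. The function names and dictionary keys are invented for the example.

```python
def describe_lsq_integer(items):
    """Label the three items of .lsq_integer (order as in set_prefs.omac)."""
    cutoff_tenths, min_residues, colour_code = items
    return {
        "cutoff_angstrom": cutoff_tenths / 10.0,  # stored as 10 * A
        "min_residues": min_residues,
        "colour_code": colour_code,
    }


def cutoff_to_item(cutoff_angstrom):
    """Convert a cut-off in A to the INTEGER expected for item 1."""
    return round(cutoff_angstrom * 10)


print(describe_lsq_integer([38, 3, 16711680]))
print(cutoff_to_item(3.8))
```

So the listing above corresponds to a cut-off of 3.8 Å and at least 3 residues.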

10.6.24 - .refi_dict

Contains the dictionary used by "Refi_zone".

.REFI_DICT T 561 36

HYDR DICTIONARY:LEVITT/HERMANS NAME

ALA -1 10 2

N 1-1 3 0 0 1.32 114 180 0 0

HN 0 1 0 0 0 1.00 123 180 1 0

CA 2 1 5 9 1 1.47 123 180 1 2

HA ' 0 3 0 0 0 1.10 110 120

CB 1 3 8 0 3 1.53 110 -120 3 1

HB1' 0 5 0 0 0 1.10 109 120 0 0

HB2' 0 5 0 0 0 1.10 109 -120 0 0

HB3' 0 5 0 0 0 1.10 109 180 0 0

C' 2 31011 2 1.53 110 180 3 1

O 0 9 0 0 0 1.24 121 180 5 3

Item 3 is the colour that an active menu item gets (default is blue).