Factor Analysis/Restricted Multiple Regression Procedures
using the Pascal version of the code

A. Running Factor Analysis/Restricted Multiple Regression Program
   I.   Installation and Execution of the Program
   II.  File Generation
   III. Running Factor Analysis (FA)
   IV.  Running Restricted Multiple Regression (RMR)
B. File Description


A. Running Factor Analysis/Restricted Multiple Regression Program

I. Installation and Execution of the Program

The program runs on an IBM PC under the MS-DOS operating system (not Windows!). It does not work with the Windows NT OS.

To install the program, simply uncompress Farmr.zip into a directory.

To run the program under Windows 95/98, double-click on 'go.bat'. Before the program is executed, the computer will first be rebooted in MS-DOS mode. After the program is finished, the computer will be rebooted back into Windows 95/98 mode.

II. File Generation

The following input files are required:
   1. Spectra.
   2. names.dat - text file with the names of the spectra files.
   3. xray.dat - text file with the secondary structure content of the proteins in the training set, determined from the X-ray structures.
   4. Z_test.dat - text file containing parameters for the Z-test.

Spectra should be in ASCII-XY format, usually with X = wavenumbers and Y = intensities (absorbance, ΔA, etc.), and with the .PRN extension. The total number of data pairs should not exceed 1190. (Sample hardcopies of the files are attached, see part B.)

Example:
   1. Create directories C:\Data and C:\Results.
   2. Transfer the ASCII (spectra) files to C:\Data.
   3. Create Fullname.dat, which contains the filenames of the spectra (without extension). Make sure that there is no carriage return beyond the last filename.
   4. Copy Fullname.dat to Names.dat.
   5. Order the X-ray data in Xray.dat according to the sequence of the spectra in Names.dat.
   6. Copy Z_test.dat into C:\Data.
   7. Check that the C:\Results directory is present and clean.

III. Running Factor Analysis (FA)

   1. Start the program.
   2. Click on Functions. Choose Menu. The Main Menu will be shown.

A. Assigning paths
   1. Click on Assigning Paths.
   2. Confirm that the paths are C:\DATA and C:\RESULTS.
   3. Press Alt-X. You'll be shown the Main Menu again.

B. Preprocessing
   1. Click on Preprocessing.
   2. Click Spectra, then Preprocessing.
   3. Click All Actions so that the program does all the procedures (Global Info, Frequency Table generation and Creation of *.SPK).
   4. Input the number of points for the resulting *.SPK files (typically 200).
      - This creates the files (*.SPK files) to be used for the FA.
      - When you get 'OK' for all three steps, close the window.
   5. Click on Spectra, then Exit. You'll be back at the Main Menu.

C. Create PAS code
   1. Click on Create PAS code.
   2. Check that the following information is correct:
      a. number of points
      b. number of spectra
      c. minimum and maximum frequency
      d. Factor is picked
   3. Click Create PAS Code.
   4. When the code has been successfully created, click Close to close the menu.
   5. Press Alt-X to go back to the Main Menu.

D. Run Factor Analysis
   1. Click Factor Analysis in the Main Menu.
   2. Click Spectra, then Calculate.
   3. Click All Actions to run the four steps (normalization of spectra, generation of correlation matrix, diagonalization and subspectra generation).
   4. When it is finished, close the window.
   5. Go to the Main Menu by pressing Alt-X.

IV. Running Restricted Multiple Regression (RMR)

This is done most conveniently right after the FA, at which stage all files are in their correct directories.

A. Assigning paths
   1. Click on Assigning Paths.
   2. Check that the directories are correct (C:\DATA and C:\RESULTS).
   3. Press Alt-X. You'll be shown the Main Menu again.

B. Create PAS code
   1. Click on Create PAS code.
   2. Check that Regression is picked.
   3. Input the number of subspectra in Regression Coefficients.
   4. Input the number of proteins with known X-ray data in RTG files.
   5. Input the number of protein secondary structures desired in Protein Structures.
   6. Input the number of regression results to be considered in Best N means.
   7. Click Create PAS Code.
   8. When the code has been successfully created, close the menu.
   9. Choose Regression.

C. Regression
   1. Click Spectra, then Calculate.
   2. Click Fit.
      a. Click Assign X and choose coef.mat.
      b. Click Assign Y and choose xray.dat.
      c. Click Assign Output and input the name (example: am3f.dat).
      d. Click Assign Fit and input the filename (example: am3f.po).
      e. Click Done.
      f. When the fitting is finished (shows OK), proceed to prediction.
   3. Click Prediction.
      a. Click Assign X and choose coef.mat.
      b. Click Assign Y and choose xray.dat.
      c. Click Assign Output and input the name (example: am3p.dat).
      d. Click Assign Fit and input the filename (example: am3f.poa).
      e. Click Done.
   4. Click Close.


B. File Description

Xray.dat

   ! conv = 4 0 40.5 9.28 19.8 30.4
   ! cytoc = 6 42.7 0 15.5 8.74 33
   ! hmgl = 9 62.7 0 18.8 6.62 11.9
   ! myo = 13 77.1 0 9.8 1.96 11.1
   ! riboa1= 16 21 34.7 11.3 14.5 18.6
   ! cran = 5 16 28.9 12.9 15.2 27
   ! chysin= 3 11.8 32.1 11.4 14.4 30.4
   ! glu = 8 29.3 18.7 10.4 19.3 22.3
   ! lyso = 12 38.8 7.75 20.9 16.3 16.3
   ! supdi = 19 1.99 38.4 14.6 20.5 24.5
   ! ribs = 17 20.8 35.2 7.2 14.4 22.4
   ! tryi = 22 20.7 24.1 6.9 19 29.3
   ! subti = 18 30.2 17.8 15.3 12 24.7
   ! lade = 11 36.8 11.3 14.3 13.1 24.6
   ! aldeh = 1 24.9 20.6 14.7 13.6 26.2
   ! chygn = 2 14.3 32.2 14.3 12.7 26.5
   ! imun = 10 2.8 47.7 14 11.2 24.3

Z_test.dat

   ! Zdroj Andel,J.:Matematicka statistika, str.329
   ! F_(1,21)(0.01) 8.02
   ! F_(2,20)(0.01) 5.85
   ! F_(3,19)(0.01) 5.01
   ! F_(4,18)(0.01) 4.58
   ! F_(5,17)(0.01) 4.34
   ! F_(6,16)(0.01) 4.20
   ! F_(7,15)(0.01) 4.14
   ! F_(8,14)(0.01) 4.14
   ! F_(9,13)(0.01) 4.19
   ! F_(10,12)(0.01) 4.30
   ! F_(11,11)(0.01) 4.46
   ! F_(10,12)(0.01) 4.30
   ! F_(9,13)(0.01) 4.19
   ! F_(8,14)(0.01) 4.14
   ! F_(7,15)(0.01) 4.14
   ! F_(6,16)(0.01) 4.20
   ! F_(5,17)(0.01) 4.34

Fullname.dat

   alb1v
   myo1v
   hem1v
   can1v
   cht1v
   sdm1v
   cah1v
   pap1v
   lys1v
   rna1v
   tln1v
   rns1v
   cytt1v
   grs1v
   adh1v
   cga1v
   rei2v
   pti1v
   ldh1v
   lcf1v
   rhd1v
   sbt1v

pcfarmr.doc 11/01/00 12:49 PM
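
The file-generation steps in part A.II (spectra limited to 1190 data pairs, Fullname.dat without a carriage return after the last filename, Names.dat as a copy of Fullname.dat) can be checked before starting the program. The following is only a minimal Python sketch and is not part of the FA/RMR package. The helper names (count_xy_pairs, build_names_text, prepare) are hypothetical, and it assumes that the .PRN files hold one whitespace- or comma-separated X Y pair per line and that alphabetical order of the spectra is acceptable.

```python
from pathlib import Path

MAX_PAIRS = 1190  # limit on data pairs per spectrum stated in the manual


def count_xy_pairs(path):
    """Count lines whose first two fields are both numeric (one X, Y pair)."""
    pairs = 0
    for line in Path(path).read_text().splitlines():
        fields = line.replace(",", " ").split()
        if len(fields) < 2:
            continue
        try:
            float(fields[0])
            float(fields[1])
        except ValueError:
            continue  # header or other non-numeric line (assumption: skip it)
        pairs += 1
    return pairs


def build_names_text(stems):
    """Join the names with CR LF and put no line ending after the last one."""
    return "\r\n".join(stems)


def prepare(data_dir):
    """Check every .PRN file, then write Fullname.dat and Names.dat."""
    data_dir = Path(data_dir)
    spectra = sorted(p for p in data_dir.iterdir() if p.suffix.lower() == ".prn")
    too_long = [p.name for p in spectra if count_xy_pairs(p) > MAX_PAIRS]
    if too_long:
        raise ValueError(f"more than {MAX_PAIRS} data pairs in: {too_long}")
    stems = [p.stem for p in spectra]
    payload = build_names_text(stems).encode("ascii")
    # write_bytes avoids any newline translation by the host system
    for name in ("Fullname.dat", "Names.dat"):
        (data_dir / name).write_bytes(payload)
    return stems
```

Calling prepare("C:/Data") would return the list of spectrum names in the order written. Because the sketch sorts the spectra alphabetically, the rows of Xray.dat would still have to be arranged by hand in that same order.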
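
Since the rows of Xray.dat have to follow the order of the spectra in Names.dat, a quick consistency check on the X-ray file can catch ordering or typing mistakes early. The Python sketch below is an illustration only. It assumes, based on the sample listing in part B, that every record starts with "!", carries a label and an index, and ends with five structure values that add up to roughly 100; the function names parse_xray and rows_off_total are hypothetical.

```python
def parse_xray(text, n_structures=5):
    """Split Xray.dat text into (label, values) records.

    Assumption: each record begins with '!' and its last `n_structures`
    tokens are the secondary-structure values. This works whether the
    values share a line with the label or sit on the following line.
    """
    records = []
    for chunk in text.split("!"):
        tokens = chunk.replace("=", " ").split()
        if len(tokens) < n_structures + 1:
            continue  # empty chunk or a comment without data
        label = tokens[0]
        values = [float(tok) for tok in tokens[-n_structures:]]
        records.append((label, values))
    return records


def rows_off_total(records, total=100.0, tolerance=0.5):
    """Return labels of records whose values do not sum to about `total`."""
    return [label for label, values in records
            if abs(sum(values) - total) > tolerance]
```

Comparing len(parse_xray(text)) with the number of names in Names.dat gives a further check that both files describe the same set of proteins.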