New Mexico Supercomputing Challenge
Challenge Team Interim Report
The goal of our project is to model the tertiary structure of proteins in a three-dimensional environment. Because of the complex configuration of proteins and the bonds that emerge as their molecules interact, the program will require intensive computation and will make use of supercomputing capabilities. The program will also combine the user-friendliness of the Java programming language with the efficiency of the C++ programming language. Users will enter data into a graphical interface written in Java, which translates the data into a file that is passed to the next portion of code, written in C++. Finally, the three-dimensional model of the protein will be displayed for the user. If successful, this program would prove useful in the field of biology and may have implications in other areas of science.

Our progress on the coding is as follows. The user interface is written in Java and accepts, in a text box, a sequence of three-letter abbreviations for the 20 amino acids. Accepting the sequence in a text box gives the user the flexibility to view and edit the complex composition of the protein. The interface also provides a key of the 20 amino acids and their abbreviations.

The second module is progressing less quickly. Its basic form is outlined in C++ code: each line of the outline is a different function, and the main function simply calls them. Additional functions may be added later.
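To make the hand-off from the Java interface to the C++ code more concrete, here is a minimal sketch of how the C++ side might read a sequence of three-letter abbreviations. It is an illustration only, not our actual code: the names normalizeCode, aminoAcidTable, and parseSequence are hypothetical, and it assumes the text file simply holds the abbreviations separated by whitespace, which has not been decided. The abbreviations themselves are the standard ones for the 20 amino acids.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: normalize a token such as "ala" or "ALA" to "Ala".
std::string normalizeCode(const std::string& token) {
    std::string out = token;
    for (std::size_t i = 0; i < out.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(out[i]);
        out[i] = static_cast<char>(i == 0 ? std::toupper(c) : std::tolower(c));
    }
    return out;
}

// The 20 standard amino acids: three-letter abbreviation -> one-letter code.
const std::map<std::string, char>& aminoAcidTable() {
    static const std::map<std::string, char> table = {
        {"Ala", 'A'}, {"Arg", 'R'}, {"Asn", 'N'}, {"Asp", 'D'}, {"Cys", 'C'},
        {"Gln", 'Q'}, {"Glu", 'E'}, {"Gly", 'G'}, {"His", 'H'}, {"Ile", 'I'},
        {"Leu", 'L'}, {"Lys", 'K'}, {"Met", 'M'}, {"Phe", 'F'}, {"Pro", 'P'},
        {"Ser", 'S'}, {"Thr", 'T'}, {"Trp", 'W'}, {"Tyr", 'Y'}, {"Val", 'V'}};
    return table;
}

// Hypothetical parser. Assumes whitespace-separated three-letter
// abbreviations (an assumed file format) and returns one-letter codes.
// Throws std::invalid_argument on an unrecognized abbreviation.
std::vector<char> parseSequence(const std::string& text) {
    std::vector<char> residues;
    std::istringstream in(text);
    std::string token;
    while (in >> token) {
        const std::string code = normalizeCode(token);
        std::map<std::string, char>::const_iterator it =
            aminoAcidTable().find(code);
        if (it == aminoAcidTable().end()) {
            throw std::invalid_argument(
                "unknown amino acid abbreviation: " + token);
        }
        residues.push_back(it->second);
    }
    return residues;
}
```

Under these assumptions, calling parseSequence on the contents of the text file yields one entry per residue, and a mistyped abbreviation is reported immediately instead of being passed on to later functions.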
Additionally, we are planning to create at least two of our own classes. These two classes are "chem.h" and "cube.h". "chem.h" will contain data and formulas that will be used in tlaconvert, charge, fold, and probably check. It will contain the 3D structures of all the amino acids, in addition to formulas to determine attraction charges, atomic radii, and similar quantities. "cube.h" is the class that will create the format for the 3D matrix. It will be similar in most respects to the apvector and apmatrix classes.

As for the third module, we are not sure of the exact specifications yet. We have a slight dilemma in that our computer science class does not cover image production. Before we can outline our third module, we will need to understand more about the way C++ produces pictures. However, the basic idea is that the computation module will give a 3D matrix to the visualization module. The visualization module will turn all of the atoms into a picture that the user can easily comprehend.

Our progress has not been stellar, mostly because no one in our group knew Java or C++ before this year. We are all taking computer science courses. We have a rudimentary knowledge of the languages and are working hard to do what we can with limited resources.

SCHEDULE

This is a list of upcoming important dates.

January 14, 2000 - By this date the fileio system will be completed or near completion. The Java module should be able to communicate via text documents with the C++ program. Additionally, we will have a better idea of the workings of the visualization system. It may also be fully outlined.
January 18, 2000 - The skeleton of the visualization module will be outlined in code.
January 19, 2000 - The tlaconvert function will be fleshed out with comments. The function may not be working, but all parts of it will be fully accounted for.
January 20, 2000 - The Java interface will be completed and fully functional. The bugs will also be worked out of the fileio system.
Additionally, the skeleton of the MPI interface will be commented in the code. This will give us a means to understand the implementation system of the program.
January 21, 2000 - Our team will be attending the Region 4 workshop. Our program will be in skeleton form. This allows us to receive good feedback on the code that we have written yet remain flexible on the overall structure. If drastic changes need to be made, we will be able to carry them out without losing a vast amount of code.
January 23, 2000 - We will have used the weekend to take in all the advice that we received at the workshop. We will revise our schedule if necessary and write the skeleton code for any additional parts that we added during the workshop.
January 28, 2000 - Throughout the week we will have been dealing with the classes that are required by our program. By the end of January 25, we will have outlined all the classes needed. By January 28, these classes will be partially completed. They will have the general appearance of their final form.
January 31, 2000 - All functions will be fully laid out in comments. Their skeleton code will be done.
February 4, 2000 - All the classes will be completed by this date. They will be in their final or very near-final form.
February 10, 2000 - The tlaconvert, chem, and charge functions will be done, along with most of the visualization module.
February 11, 2000 - All bugs will be taken out of the working code in preparation for the project evaluation.
February 12, 2000 - Our program will go through the preliminary project evaluation.
February 18, 2000 - The fold function will be mostly completed, and the entire visualization module will be in working condition. This does not mean it will be in final form. It means only that it will be enough to display some semblance of a visual representation of the data we have constructed.
February 25, 2000 - Both the fold and check functions will be completed.
The visualization module will be presentable.
February 29, 2000 - The finalize function will be completed. At this point, the entire program will run on a microcomputer.
March 3, 2000 - The program will be touched up and polished.
March 10, 2000 - The program will have been completely overhauled. The program will now be able to initialize the MPI interface and gather data about the computer that it is being run on.
March 17, 2000 - The MPI interface will be completed and touched up. If everything goes according to plan, the program will be ready for inspection.
March 18, 2000 - Testing begins on the program with data procured from biology resources that have been gathered in the last two months.
March 24, 2000 - Testing finishes and all kinks are worked out. The program is completely ready.
March 26, 2000 - All areas of our presentation and report will be outlined and organized.
March 28, 2000 - All materials will be accumulated and everything will be set up. Work on the final report begins.
March 29, 2000 - Work on the presentation begins.
April 2, 2000 - The presentation will be finished. Explanations and all resources will be completed as well. The final report will be read over by several outside sources for last-minute changes.
April 3, 2000 - The final report will be completed and printed in several hard copies.
April 4, 2000 - The final report will be submitted.
April 5, 2000 - DEADLINE FOR FINAL REPORTS

Conclusion: Although the amount of work put into this project seems to increase as the project progresses, this is misleading. As we gain more knowledge about the programming languages, things will become easier and will be accomplished much more quickly and efficiently. Additionally, we have split the project modules among the three members of our team. As the Java module will be finished by January 20, this will free up a member of our team.
Although the visualization module is never truly done, because it can always be improved upon, we will have a member of our team who can assist with the other two modules. Overall, the project is working out quite nicely for a team with a single veteran of the challenge and only three people overall. It should come together to create a visually pleasing, user-friendly, and computationally powerful combination.
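As a supplement to the description of "cube.h" earlier in this report, the following is a minimal sketch of what a 3D matrix class in the spirit of apvector and apmatrix might look like. It is an assumption-laden illustration and not our finished class: the class name Cube, its member functions, and the choice to store the grid in one flat std::vector with bounds-checked access are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "cube" container: a 3D grid stored in one flat vector.
// The interface below is an assumption made for illustration.
template <typename T>
class Cube {
public:
    // Create an nx-by-ny-by-nz grid with every cell set to `fill`.
    Cube(std::size_t nx, std::size_t ny, std::size_t nz, const T& fill = T())
        : nx_(nx), ny_(ny), nz_(nz), data_(nx * ny * nz, fill) {}

    std::size_t sizeX() const { return nx_; }
    std::size_t sizeY() const { return ny_; }
    std::size_t sizeZ() const { return nz_; }

    // Bounds-checked element access (checked via assert in debug builds).
    T& at(std::size_t x, std::size_t y, std::size_t z) {
        return data_[index(x, y, z)];
    }
    const T& at(std::size_t x, std::size_t y, std::size_t z) const {
        return data_[index(x, y, z)];
    }

private:
    // Map (x, y, z) to a position in the flat vector, with z varying fastest.
    std::size_t index(std::size_t x, std::size_t y, std::size_t z) const {
        assert(x < nx_ && y < ny_ && z < nz_);
        return (x * ny_ + y) * nz_ + z;
    }

    std::size_t nx_, ny_, nz_;
    std::vector<T> data_;
};
```

With a class along these lines, the computation module could fill a Cube with per-cell data and pass it to the visualization module as a single object. Keeping the data in one contiguous block may also make it simpler to send between processes later, although that depends on how the MPI interface is eventually designed.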