The Data Coordinating Center

The Data Coordinating Center (DCC) provides services to members of the School of Medicine. The services focus mainly on managing data for ongoing or new research projects. We specialize in the planning, development, management, and operation of systems that ensure achievement of the goals of these projects in a technologically modern environment.

About Us

We are a service center of the School of Medicine that is based in its Department of Health Research and Policy. The DCC was launched formally in March 2002 to fulfill the needs of research in the Stanford University School of Medicine.

We are best reached via email. Please send email to hrpdcc@lists.stanford.edu with a brief description of your project and we will get back to you.

Organization

The DCC is headed by Balasubramanian Narasimhan and overseen by a senior advisory board. Professor Olshen chairs the board.

People

The current staff consists of

Services

The DCC provides several services to School of Medicine research activities, contingent upon the availability of sufficient resources on both DCC and project ends. These services can be divided broadly into several categories.

Planning

DCC personnel work with investigators to plan for their needs as they pertain to data. The ideal engagement occurs at the time a project is being conceived or a grant application is being written, rather than later, although we realize this may not be possible for some projects. The planning phase involves

Infrastructure

The DCC infrastructure enables us to provide the following technological capabilities to clients.

Security

The DCC provides investigators with a secure environment to store their data. Attention is paid both to physical security (locks and keys) and to the security of data (on-site and off-site backups, encryption, authentication, role-based access). The DCC uses Secure Sockets Layer (SSL) connections when providing researchers with role-based access to the data over the web. Security procedures are continuously monitored and upgraded as necessary. The DCC works with the Stanford University Privacy and Data Security Officer to comply with all HIPAA regulations as they pertain to a project.
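The role-based access described above can be illustrated with a minimal sketch. The roles, permissions, and the `secureChannel` flag below are hypothetical names chosen for illustration; the text does not describe the DCC's actual access scheme. The sketch only shows the general idea: a request is honored when it arrives over an encrypted connection and the user's role grants the requested permission.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative only: these roles and permissions are assumptions,
// not the DCC's actual configuration.
final class RoleBasedAccess {

    enum Permission { READ_DATA, ENTER_DATA, EXPORT_REPORTS, MANAGE_USERS }

    enum Role {
        DATA_ENTRY(EnumSet.of(Permission.READ_DATA, Permission.ENTER_DATA)),
        ANALYST(EnumSet.of(Permission.READ_DATA, Permission.EXPORT_REPORTS)),
        ADMINISTRATOR(EnumSet.allOf(Permission.class));

        private final Set<Permission> permissions;

        Role(Set<Permission> permissions) {
            this.permissions = permissions;
        }

        boolean allows(Permission permission) {
            return permissions.contains(permission);
        }
    }

    // Allow a request only if it came over an encrypted channel (for
    // example SSL) and the role grants the requested permission.
    static boolean isAllowed(Role role, Permission permission, boolean secureChannel) {
        return secureChannel && role != null && role.allows(permission);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(Role.ANALYST, Permission.EXPORT_REPORTS, true));    // true
        System.out.println(isAllowed(Role.DATA_ENTRY, Permission.MANAGE_USERS, true));   // false
        System.out.println(isAllowed(Role.ADMINISTRATOR, Permission.READ_DATA, false));  // false
    }
}
```

In a servlet-based application the `secureChannel` value would typically come from the request itself (for instance `HttpServletRequest.isSecure()`), and the role from the authenticated session.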

Science and Education

The DCC has close connections with the Division of Biostatistics, the Department of Statistics and the Department of Genetics. The DCC brings developments in computer-intensive statistical inference to bear upon our data. Our collaborations have resulted in the production of widely used tools such as enhancements to CART (Olshen et al.), and original development of SAM (Tibshirani, Narasimhan et al.) and PAM (Hastie, Tibshirani, Narasimhan et al.).

DCC personnel are also involved in the education of colleagues and investigators. Our staff have lectured on data management and security in classrooms and seminars as opportunities permit.

Current Projects

The DCC is involved in the following projects.
SAPPHIRe
The Stanford Asian Pacific Program in Hypertension and Insulin Resistance (SAPPHIRe) is part of the Family Blood Pressure Program (FBPP) network and is funded by NHLBI. The first phase of this project was a collaboration among Stanford University, Hawaii and Taiwan. In 2000, this project entered a second follow-up phase. Dr. Thomas Quertermous, William G. Irwin Professor in Cardiovascular Medicine, is the principal investigator.

The DCC is involved in data entry, management and reporting for this project. The initial application used a Sybase database with a Perl/CGI interface. The application was completely rewritten and ported to a modern Java/Oracle interface and in the process a number of enhancements and new features were added to this project.

PIMA
This is an NIH-funded clinical trial headed by Dr. Bryan Myers, Professor and Chief, Division of Nephrology, Stanford University School of Medicine in collaboration with Robert G. Nelson of NIH. In this study on the population of PIMA Indians in Arizona, individuals are entered into a randomized, controlled trial of losartan plus standard care versus standard care over several years. The DCC is involved in building data entry and reporting systems for the entire project.

NOPain
This study deals with the manipulation of the nitric oxide synthase pathway in arterial disease using L-arginine. It consists of two parts: a dose-ranging study and a randomized controlled clinical trial. The DCC is involved in designing the data entry systems and reporting systems for the entire project. Dr. John Cooke, Professor of Medicine and Director, Section of Vascular Medicine, Stanford University School of Medicine, is the principal investigator.

Genetic Determinants of PAD
This is a large study of the genetic determinants that increase the propensity of an individual to develop hemodynamically significant atherosclerosis in the arteries of a lower extremity. Through these efforts investigators will also examine the interactions of genetic determinants with known risk factors for atherosclerosis. The principal investigator is Dr. John P. Cooke, with co-PI Dr. Thomas Quertermous. This project therefore dovetails well with the SAPPHIRe and NOPain projects and with the Reynolds Center, in that DCC technologies brought to bear upon the earlier projects will enable our work here. Expertise at finding SNPs, as in SAPPHIRe and the Reynolds Center, will figure here, and so, too, will microarray analysis. This project will be somewhat different from the others in that genotyping will be done in the Cardiovascular Research Center on the Stanford campus proper. Our approach via the Web will once again prove important.

CHIPCSD
The Children's Health Initiative (CHI) funded a project for creating a pediatric cardiac surgery database. The goal is to build a database that is geared both to research and to patient care. Our main contact is Dr. Daniel Bernstein, Professor and Chief, Division of Cardiology in the Department of Pediatrics, Stanford University School of Medicine. The project is currently under development and is expected to go live sometime in June 2003.

Hypoxic Cytotoxins
This project consists of four sub-projects, each dealing with a different aspect of cytotoxic drug treatment for cancer. Project 1 seeks to design, synthesize and further develop several series of small-molecule drugs for each of the other projects. Project 2 will develop new prodrugs that become activated to cytotoxic anticancer drugs by the nonpathogenic obligate anaerobe C. sporogenes, genetically engineered to express the prodrug-activating enzymes. Project 3 aims to develop an improved analog of the hypoxia-selective cytotoxin tirapazamine (TPZ). Project 4 hopes to find drugs that are preferentially toxic to cells expressing the hypoxia-inducible transcription factor HIF-1α. This is an effort led by Dr. Martin Brown, Professor of Radiation Oncology, in collaboration with researchers in New Zealand. The project is currently under development and is expected to go live sometime in June 2003.

The Reynolds Center at Stanford
The aim of the Donald W. Reynolds Cardiovascular Clinical Research Center at Stanford University is to provide better care for patients with heart disease through the application of modern genetic approaches. Dr. Mark Hlatky is Director of the Center, which has a strong collaboration with Kaiser Research in Oakland. Projects seek to utilize the techniques of modern molecular biology to identify genes for which abnormalities predispose to heart disease in a specific way. These genes will then be examined for unique mutations that can serve as markers to track disease in larger populations. The project is large in scope and consists of several subprojects.

The DCC is involved in many activities of the Reynolds Center. We have built systems for recruitment, scheduling clinic visits, generating reports and result letters, clinical visit data collection, barcode generation, and sample tracking. As the analysis phase of the project ramps up, the DCC is the place where the final summary data will reside. Systems are under development to tailor reports to authorized users of the data for scientific analysis.

Prospective Randomized Study of Elective Colon and Rectal Surgery, With and Without Mechanical Bowel Preparation
This study is undertaken under the leadership of Drs. Mark Welton and Andrew Shelton of the Department of Surgery. The goal is to compare rates of infectious complications and rates at which bowel re-attachments separate in elective colon and rectal surgery, with and without mechanical cleansing (purging) of the bowel. Again, we in the DCC work with the investigators to design forms, to enable entering data over the Web, and to archive the data for future purposes.

Dr. Ronald Levy's Lymphoma Program
The Levy Lab, under the direction of Dr. Ronald Levy, has been studying the treatment of non-Hodgkin's lymphomas, improving therapy of this cancer, understanding its pathogenesis and studying normal lymphocyte biology. Research in monoclonal antibody therapies and tumor vaccines is ongoing. The DCC will work with investigators to design forms, to enable entering data over the Web, and to archive the data for future purposes.

TA: Viral and Host Mechanisms
Pathophysiology of transplant coronary vasculopathy focusing on the role of diabetes and CMV infection. Noninvasive diagnosis of cardiac allograft rejection; pathobiology of graft rejection.

Technologies

A founding principle of the DCC was that it would bring developments in Free Software, Open Source Software and Web technologies to bear on its activities. The following are some of the software and tools used at the DCC.

GNU/Linux
The DCC servers are all GNU/Linux-based systems. GNU/Linux systems provide us with a solid, secure and stable environment at a fraction of the cost of other platforms. In particular, we use a hardened version of Linux developed here at Stanford University.
Oracle
We use Oracle as our core database software. Oracle is the premier database program and has solid support on the Linux platform.
Java
As most of the services provided by the DCC are Web-based, we make extensive use of Java technology from Sun Microsystems.
Apache
We run the excellent Apache as our web server with SSL enabled.
Jakarta Project tools
The Jakarta project is the source for most of our development tools. At the DCC, we use
  • Ant, the Java-based tool for building web applications
  • Tomcat, the well-known servlet container
  • XML (eXtensible Markup Language) tools such as the Xerces parser and the Xalan style sheet processor. In particular, we also use SVG, Batik, and FOP.
  • ECS, the Element Construction Set for generating dynamic Web pages
  • Log4j, a logging library for Java
  • POI, a Java library for generating OLE documents on the fly
  • REGEXP, a regular expression library for dynamically validating form fields and building indigenous tools. We also use the GNU Regexp library.
  • Taglibs, a useful library of custom tags for use with JavaServer Pages
  • Struts, a model-view-controller framework for constructing web applications with servlets and JavaServer Pages
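
As an illustration of the form field validation mentioned under REGEXP in the list above, here is a minimal sketch. It uses the standard `java.util.regex` package rather than the Jakarta Regexp or GNU Regexp libraries named in the list, and the field names and formats (`subjectId`, `visitDate`, `systolicBp`) are hypothetical; the DCC's actual forms are not described in the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: field names and formats are assumptions.
final class FormFieldValidator {

    // One pattern per field, kept in insertion order so that
    // validation errors are reported in a predictable order.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        // Two uppercase letters followed by four digits, e.g. "SF0123".
        RULES.put("subjectId", Pattern.compile("[A-Z]{2}\\d{4}"));
        // ISO-style date; checks month and day ranges, not calendar validity.
        RULES.put("visitDate",
                Pattern.compile("\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])"));
        // Two or three digits.
        RULES.put("systolicBp", Pattern.compile("\\d{2,3}"));
    }

    // True if the field has a rule and the whole value matches it.
    static boolean isValid(String field, String value) {
        Pattern rule = RULES.get(field);
        return rule != null && value != null && rule.matcher(value).matches();
    }

    // Names of all known fields that are missing or malformed in the form.
    static List<String> invalidFields(Map<String, String> form) {
        List<String> invalid = new ArrayList<>();
        for (String field : RULES.keySet()) {
            if (!isValid(field, form.get(field))) {
                invalid.add(field);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        Map<String, String> form = new LinkedHashMap<>();
        form.put("subjectId", "SF0123");
        form.put("visitDate", "2003-13-01"); // month 13 is rejected
        System.out.println(invalidFields(form)); // [visitDate, systolicBp]
    }
}
```

A pattern check of this kind catches malformed input early; range and consistency checks (for example, whether a date exists on the calendar) would still need separate logic.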
GNU
We use a number of Free Software Foundation tools such as Autoconf and Emacs (with JDE).
R
For statistical analysis, we use R, a modern statistical environment for data analysis. Using remote connection packages, we use R at the backend for generating statistical analysis and plots.
Tigris
We use design tools such as ArgoUML, a UML design tool with cognitive support.
DataVision
We use DataVision for generating reports.

We are also evaluating and testing a number of other software packages, such as JBoss and Eclipse, as well as additional commercial tools that may not have exact open source equivalents.

Of course, there are situations when no existing software can fit the need. The DCC has developed indigenous tools in such cases to fill the need.

Last modified: August 1, 2004