STANFORD UNIVERSITY

INFORMATION TECHNOLOGY SERVICES

WWW Strategic Vision

Written by Russ Allbery & Tim Torgenrud

This document lays out the strategic vision for web content delivery services at Stanford, focusing specifically on the centralized services provided by ITS. It also covers infrastructure support for applications provided on the web, but not the applications themselves, since nearly all applications now have a web interface.

Primarily covered here are the main campus www.stanford.edu web servers and the sites they serve, the supporting CGI servers for user-written web applications, and the campus proxy and virtual hosting services. While WebAuth is mentioned, it is primarily covered in the authentication vision as an authentication service. Web interfaces for specific services (such as webmail) are not covered here.

Principles

The web has become a primary content delivery system and user interface to applications in almost every area of computing. It is the main public face that institutions, working groups, and individuals present to the Internet at large and the preferred mechanism for content delivery. As a result, it is also a hotbed of activity around new technologies, new types of content, new applications, and new work styles.

To some degree, every technological area has to keep abreast of developments around the web. The purpose of the campus WWW infrastructure is not to address all web problems but rather to focus on providing a basic and universal infrastructure that:

  • Provides basic content delivery services and basic support for simple user web applications to the entire campus community in a way that allows use of the core services as a foundation on which web sites and applications can be built.

  • Provides foundational tools and integration mechanisms for use in the construction of web applications, specifically including authentication and directory integration.

  • Provides a set of solutions for the most common web needs across the organization in the areas where a central service can provide the maximum benefit for the least cost. An example would be basic web form support.

  • Brings together various pieces of the central campus infrastructure in a modular way, allowing application developers throughout the institution to more easily integrate with existing infrastructure and avoid having to reinvent solutions to solved problems.

For web services, even more than other services, we can't do everything. Special attention should therefore be paid to the Foundational principle in VisionPrinciples. The goal is to set up a stable, secure, well-understood, and well-documented foundation upon which the campus community can explore and build all the varied types of web applications in current use.

Currently, the central web services are significantly behind current technology and are not providing adequate service to the campus community. We are as far as five years behind current best practices for web infrastructure in some areas, and as a result much of the community has been driven to developing their own duplicate services and running their own servers. Without some attention and improvement, our web services are in significant danger of becoming irrelevant to the needs of the campus.

Technologies

Stable core technologies:

  • Apache 2.x HTTP server for Unix-based web applications and general web hosting needs.

  • Microsoft IIS for Windows-based web applications.

  • Firefox, Internet Explorer, and Safari for client browsers.

  • WebAuth v3 for web authentication and LDAP directory integration.

  • Tomcat for Java servlets (currently a mix of Tomcat 3 and Tomcat 4, moving to Tomcat 4 and 5).

  • PHP 4, Perl 5.6 (moving to 5.8), and Python 2 for CGI scripting and web applications that don't use Java.

  • MySQL as a database engine for web applications.

  • The Google search appliance for intranet searching.

  • AFS as a shared file system for storing content that should be served out of the central campus web infrastructure and for managing ACL controls on modifying that content.

  • The core W3C standards for web content, including HTML 4, XHTML, CSS, and DOM, as well as JavaScript for Dynamic HTML.

  • Server-parsed HTML for some basic web page programmability needs.

  • PAC-based proxying through a WebAuth-enabled server to provide authenticated access to purchased content such as academic journals.
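
As an illustration of the PAC-based proxying item above: a real PAC file is a JavaScript function named FindProxyForURL that returns either "DIRECT" or a "PROXY host:port" string for each request. The sketch below models only that decision logic in Python. The proxy host, port, and licensed domain names are hypothetical placeholders, not values used at Stanford, and the WebAuth login step performed by the proxy itself is not shown.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration only; not real campus settings.
PROXY = "PROXY proxy.example.edu:8080"
LICENSED_SUFFIXES = ("journals.example.com", "archive.example.org")


def find_proxy_for_url(url):
    """Return a PAC-style answer: the proxy for licensed sites, else DIRECT."""
    host = (urlsplit(url).hostname or "").lower()
    for suffix in LICENSED_SUFFIXES:
        # Match the domain itself or any subdomain, but not look-alike names.
        if host == suffix or host.endswith("." + suffix):
            return PROXY
    return "DIRECT"


if __name__ == "__main__":
    for url in ("http://www.journals.example.com/article/1",
                "http://www.stanford.edu/"):
        print(url, "->", find_proxy_for_url(url))
```

Only requests for purchased content are routed through the authenticating proxy in this sketch; all other traffic goes direct, which keeps load on the proxy proportional to journal use rather than to general browsing.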

Technologies new to IT Services:

  • Blogs, including the various syndication and aggregation protocols (RSS, Atom, etc.).

  • Content management systems such as Zope for easier maintenance of whole web sites and broader basic toolsets for common tasks.

  • Web forum software, such as phpBB (but hopefully better).

Emerging technologies: (see Research below)

  • Coursework, the next generation (Sakai).

  • WebDAV for both web site management and generalized version control.

  • XML and XSLT for general web content delivery (both are already in use for system integration).

Deprecated technologies: (see Projects below)

  • Apache 1, including WebAuth v2.

  • Formage (see Projects).

  • Docushare, which has been slated for retirement for quite some time now, but is still in widespread use internally. We won't be able to truly eliminate the use of Docushare until we have a document management replacement.

Other technologies in use: (These technologies are currently deployed and useful in specific circumstances, but either are not attractive for broader use or have a limited scope of applicability -- we are neither recommending expanding them nor recommending eliminating them at this time.)

  • There are many different client browsers in use. Firefox, Internet Explorer, and Safari are the primary browsers used by the campus community and by most of the clients accessing our web sites from elsewhere. Netscape and Mozilla also see substantial use at present, but that usage is expected to drop steadily. It makes sense for the primary browsers to be our focus when it comes to user training and user software support, but our web services should follow standards and work with any standards-compliant browser.

Projects

First:

  • Migrate the CGI service, the last major component of the web infrastructure still running on Solaris, to Linux. This lets us standardize on the same infrastructure packages on all of our web services. As part of this upgrade, revisit the CGI configuration and provide better resource limit handling and debugging control.

  • Upgrade to PHP 5 on the CGI service, making both PHP 5 and PHP 4 available for the transition.

  • Support restricted embedded PHP on the main www.stanford.edu servers for particular departments and clients (like the Stanford home page) who can be trusted to use it effectively, possibly eventually expanding this service to the general community.

  • Move the MySQL database service into official production, with a way for users to request databases for their applications and documentation on our support model and basic usage.

  • Provide an automated log analysis service that, rather than giving users raw log dumps, runs Analog or some equivalent program on those logs and provides them with the report. This would remove the need for much of the current log dumping service as well as provide directly what most of the clients are really looking for.

  • Pick a solid wiki product among the many available and turn it into a production service, integrating it into the campus authentication system. Confluence looks like the current front-runner.
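
To make the automated log analysis item above more concrete, here is a minimal sketch of the kind of summary such a service could produce in place of a raw log dump. It assumes the servers write Apache's Common Log Format; the function name, the report fields, and the sample log lines are illustrative assumptions. A production service would more likely run Analog or an equivalent program, as the item says.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines, top=3):
    """Build a small report from access log lines, counting unparsable ones."""
    paths = Counter()
    statuses = Counter()
    total_bytes = 0
    skipped = 0
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            skipped += 1
            continue
        paths[match["path"]] += 1
        statuses[match["status"]] += 1
        if match["size"] != "-":  # "-" means no response body was sent
            total_bytes += int(match["size"])
    return {
        "requests": sum(paths.values()),
        "top_paths": paths.most_common(top),
        "statuses": dict(statuses),
        "bytes": total_bytes,
        "skipped": skipped,
    }


# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [27/Jun/2007:15:23:11 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [27/Jun/2007:15:23:12 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [27/Jun/2007:15:23:15 -0700] "GET /missing.html HTTP/1.0" 404 -',
    'not a log line',
]

if __name__ == "__main__":
    for key, value in summarize(SAMPLE_LOG).items():
        print(f"{key}: {value}")
```

A report of this shape (request counts, most requested pages, status code breakdown) covers what most site owners want from their logs, which is why the item expects it to replace much of the current log dumping service.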

Next:

  • Replace Formage. This is a home-grown web form builder application that provides some simple form submission handling, upon which some users have built amazingly complex web applications. It has stability problems and is not easily supportable. It is also obsolete; there are now many content management systems that provide far more features. We need to select one and figure out how to migrate existing Formage users.

  • Provide a centrally managed blog service, both for its own sake and as a platform for later expansion into RSS-based services.

  • Finish WebAuth's authority delegation support and put it into production use to allow web-based access to other services with the user's credentials, clearly documenting how this can be done and setting up policies and procedures for applications that want to use it.

Later:

  • Provide a web interface to AFS, at least at the level of file upload, deletion, moves, and ACL changes in designated portions of AFS. This will be far easier with Kerberos v5 AFS and should probably wait until that work is done (see the authentication vision).

  • Provide a campus WebDAV service, possibly with integration into AFS as a secondary objective, but primarily to support workgroup file sharing, storing files on a remote server for backup purposes, and document management and version control. Ideally, provide the WebDAV service in conjunction with a more general document management service. (If we adopt Confluence as a wiki, we may be able to use its WebDAV support for this.)

  • Adopt a general content management system at least for IT Services web pages and ideally as a campus service as well.

  • Adopt a general document management system that can fully replace Docushare. This may be dealt with through the combination of Confluence, WebDAV, and a content management system, but we have to make sure we have all of the current usages of Docushare covered.

Research

The following areas should be explored with an eye to their long-term inclusion in our WWW strategy. Without more information, it is premature to specifically identify any of these areas for projects, but if the research pans out, they may move from this section into the project section for a full production implementation. Each research project is associated with an initial application that we can use to test the results of the research.

  • Integration with the next generation of the Coursework service is extremely important for the future of web content services at Stanford, since much of what we serve out is academic content and Coursework has become the primary computer repository of supporting information for courses. What exactly Coursework will need is not yet clear, but it's important that we stay in close contact with them and be prepared to adjust our web strategic plan to meet their needs as much as possible.

  • It's not yet clear how we can most effectively use RSS and similar aggregation and web-based news feed concepts as part of the web infrastructure, but it's clear that this will be a very important part of web services in the future. We need to follow developments here and look for interesting applications that we can deploy, starting with a general blog service as an obvious first step.

  • We currently have a limited phpBB offering specifically for Coursework integration, but user demand for web forums of some sort is clear. What's less clear is what software to use, how it should integrate with email and Usenet (if at all), and how to maintain that software centrally in a secure fashion and tie it into our existing identity management and authentication infrastructure. Our web history is littered with various solutions to this problem that we've run for a while and then abandoned. We need to carefully think through what we want here and adopt a solution that will hopefully last longer than the ones we've tried in the past, and do this in conjunction with Coursework. It's possible that phpBB is the best available, but this seems unlikely.

  • Server changes and infrastructure changes may be needed to fully support some of the more interesting and exciting work in web content description, particularly around XHTML, XML, and XSLT. Some of this may tie into RSS work. Right now, there is no pressing need for server changes, but we don't want to be caught by surprise.

Last modified Wednesday, 27-Jun-2007 03:23:11 PM
