Folding@home SMP FAQ

Introduction

Since 2000, Folding@home (FAH) has led to a major jump in the capabilities of molecular simulation. By joining together hundreds of thousands of PCs throughout the world, FAH has made calculations that were previously considered impossible routine. FAH has targeted the study of protein folding and protein folding diseases, and numerous scientific advances have come from the project.

Now in 2006, we are looking forward to other methods to produce major advances in capabilities on top of what we can do with distributed computing. We have previously announced our support of high performance Graphics Processing Units (GPUs) from ATI as well as the new Cell processor in Sony's PlayStation 3 to achieve performance previously only possible on supercomputers.

We are also releasing another type of client, the Folding@home SMP client. SMP means "Symmetric Multi-processing", a term that generally refers to a computer with more than one processor core. Dual core CPUs are pretty common and even 4-core boxes (currently implemented as dual socket dual core computers, such as Apple's Mac Pro) are becoming common. With advances from Intel and AMD, quad core processors are on the horizon, with 8-core and even 16-core boxes soon to become common.

The goals of the SMP client and the GPU client are similar: in order to tackle many of the problems of interest (especially related to protein misfolding and aggregation, such as in Alzheimer's Disease), we need not just lots of computers participating, but also results returned more quickly so that we can simulate trajectories of sufficient length. Right now, we achieve this by running for many months to years (indeed, our first Alzheimer's Disease simulations ran for almost two years straight). That's where the SMP and GPU (and PS3) clients come in. They give us considerably longer trajectories in the same wall clock time, allowing us to turn what used to take years to simulate even on FAH into a few weeks to months.

Moreover, the SMP and GPU clients are complementary. The GPU client can greatly (~30x) speed up a specific type of calculation (implicit solvent calculations), whereas the SMP client can lead to a 4x speed up over the complete range of calculations we need to run. Even a 4x speed up is significant here, since it affects virtually all types of FAH calculations, turning a year's worth of work into a few months. As multi-core CPUs become more common, we expect this trend to become more and more important, especially as 8-core boxes (with dual 4-core CPUs) have already been announced.

Our goal is to apply this new technology to dramatically advance the capabilities of Folding@home, applying our simulations to further study of protein folding and related diseases, including Alzheimer's Disease, Huntington's Disease, and certain forms of cancer. With these computational advances, coupled with new simulation methodologies to harness the new techniques, we will be able to address questions previously considered impossible to tackle computationally, and make even greater impacts on our knowledge of folding and folding related diseases.

How to Run the FAH SMP Client

Important: the FAH v6 beta client has replaced the v5 SMP client for Linux and OSX platforms. Several settings have changed. Please read the v6 instructions before running the client on these platforms.

Guides for installing the SMP client are available in the Guide section (the Guide button above will go there as well). Here is a list of the available SMP installation guides:

Known Bugs and Issues

Please note that this is a beta release. While we have done lots of testing in house, there are limits to the bugs we can find in these limited tests (and hence the need for a beta test). Thus, we expect that there will be many problems with the client that need to be resolved. Below is a list of some of the relevant known issues or bugs for beta testers of this new client.

When running an integrated client, the -smp (and -deino for the Windows SMP DEINO client) flag must be used to request SMP work units. Under some circumstances, omitting the SMP option may cause loss of the current work unit.

  1. The core needs some time (often as much as 4 minutes) between work units to finalize work.
  2. Log file output is sometimes garbled because multiple threads are printing to it.
  3. The Linux client (and sometimes the OSX client) does not correctly detect RAM levels of 2GB or greater. In this situation, please make sure to set the RAM by hand in the client configuration process (to reconfigure, run the client with the -configonly flag). If the memory is not read correctly, it defaults to a very low value. If your OSX client detects some huge amount of memory (e.g., 4294965248 MB), then you need to configure the RAM by hand with -configonly.
  4. The core will print "No option -tpi" 4 times during the start up of the core
  5. The core does not clean up its work files completely (can leave several files after the WU has completed)
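
As an aside on item 3, the example value 4294965248 equals 2^32 - 2048, which is what -2048 looks like when a signed 32-bit number is read as unsigned. The sketch below only demonstrates that arithmetic; whether the client's detection code actually wraps around this way is an assumption, not something stated here. The function names and the plausibility limit are hypothetical and not part of the FAH client.

```python
UINT32_RANGE = 2 ** 32  # number of distinct 32-bit values


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value %= UINT32_RANGE
    return value - UINT32_RANGE if value >= 2 ** 31 else value


def ram_reading_is_plausible(megabytes, limit_mb=1024 * 1024):
    """Hypothetical sanity check: accept readings between 1 MB and limit_mb.

    The default limit (1 TB expressed in MB) is an arbitrary illustration value.
    """
    return 0 < megabytes <= limit_mb


if __name__ == "__main__":
    reported = 4294965248  # the example value from item 3
    print(as_signed_32(reported))
    print(ram_reading_is_plausible(reported))
```

Either way, the remedy is the one given in item 3: set the RAM by hand with -configonly.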

Notes for Running

  1. We strongly suggest people run this client on 4-core boxes. While it will run on 2-core boxes, we have noticed some potential problems (we are looking into these issues now).
  2. Most SMP WU's will be "big" WU's, so you'll have to configure the client that way. During the client configuration, when the client asks "Allow receipt of work assignments and return of work results greater than 5MB in size (such work units may have large memory demands) (no/yes)" say yes.
  3. There is a brief pause (15-20 seconds) at the end of each WU. This is so we can make sure all the threads sync up. This is not a bug, as much as a limitation of SMP needing to synchronize the threads before moving on to the next WU.
  4. The SMP core can get confused about disabling SSE, so we suggest running with the -forceasm flag if you notice that the SSE was disabled unnecessarily.
  5. The Linux client is a 32-bit executable, as we are planning on using a single client binary for SMP and non-SMP. However, this means that 64-bit Linux distros will need to have 32-bit ELF support enabled. In several popular Linux distros this can be enabled by installing the ia32-libs package.

Policy Notes

  1. The Windows SMP client will stop working after 6 months (this is a limited release beta -- new clients will be available before the current version ends its test period)
  2. Deadlines will be set to be much shorter than normal, as we need to get data back quickly in this beta test and we are releasing to a very specific set of hardware. This will likely change in time, as we move from a beta test and as we move towards supporting other platforms. However, the deadlines will be much shorter than the deadlines for normal WUs (as the reason for high performance clients like the SMP client is high performance!).
  3. If a server with SMP WU's is not available or is overloaded, the client will be assigned to 0.0.0.0, which tells the client to wait and try again. As more SMP servers come on line, this won't be an issue, but during the beta test, we want to keep the SMP clients crunching SMP WU's.

Previous bugs fixed in the current version

  1. Checkpointing was previously not working (this meant that if you quit the client and then restarted it, the client would start from the beginning of the WU).
  2. In previous clients, the core could sometimes get confused about whether the previous core ended cleanly and would turn off assembly loops. In cases like this, we strongly suggest that people run with the -forceasm flag.

Frequently Asked Questions

What operating systems will be supported?

We support three operating systems: Windows, Mac OSX/Intel, and 64-bit Linux. We are working to port to 32-bit Linux and hope to have that ready to beta test soon. The Windows version of the client has recently been released and runs under both 32-bit and 64-bit Windows.

How many cores do I need to run this? What types of CPUs?

In the beta test, we are strongly recommending that this code be run on 4-core boxes, although it can be run on 2-core boxes with reasonable performance. The code does best on Core 2 Duo/Woodcrest class chips and we recommend these systems (new iMacs, Mac Pros, etc).

Does the FAH SMP client run the same WUs as the regular FAH client?

No, the SMP client will run a set of WUs specially constructed for the SMP Core_a1's new functionality. While the SMP Core_a1 WUs use the same file format as Gromacs WUs, the scientific code that performs the calculation is different, and the WUs for Core_a1 will yield incorrect results (or simply not run) if run with Gromacs (and vice versa).

What scientific cores does the FAH SMP client support? Only Gromacs cores? Other cores like Amber?

We will support a particular Gromacs core for SMP processors only (Core_a1). Other core support (Amber or Tinker) is possible, but not on our current roadmap.

How long do you estimate this program will remain a beta before it turns into a final client?

This is hard to predict, as it depends on how well the code works 'in the wild.' Also, using multi-core processors for a single calculation in distributed computing is itself new and so there may be unexpected consequences that nobody could foresee.

How do you decide the credit value of SMP work units?

Points are determined by the performance of a given machine relative to a benchmark machine, similar to the CPU client benchmark process. Before releasing any new project (series of work units), we benchmark it on a dedicated Mac Pro with two 2.33 GHz dual-core Xeon processors (more specifically, two Woodcrest 5140 processors with 4 MB cache each, 5 GB of FBDIMM memory (667 MHz DDR2), and a 1.33 GHz bus).

We plug the results of this benchmark test into the following formula:

points = 1760 * (daysPerWU)

where daysPerWU is the number of days it took to complete the work unit.
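
As a quick worked example of the formula above, here is a minimal Python sketch. The function name and the 1.5-day work unit are made up for illustration; only the constant 1760 and the daysPerWU input come from this FAQ.

```python
POINTS_PER_DAY = 1760  # constant from the formula above


def points_for_work_unit(days_per_wu):
    """Return the credit for a WU that took days_per_wu days on the benchmark machine."""
    if days_per_wu <= 0:
        raise ValueError("daysPerWU must be positive")
    return POINTS_PER_DAY * days_per_wu


if __name__ == "__main__":
    # Hypothetical WU that takes the benchmark machine 36 hours (1.5 days).
    print(points_for_work_unit(1.5))
```

Note that daysPerWU is the time measured on the benchmark machine, not on your own machine, so the point value of a WU is fixed no matter how fast your hardware finishes it.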

Please note the very concept of a reference machine will mean that some WU performance will vary from the performance on your machine. Even between various Xeon processors, there are significant differences in architectures. Moreover, there are variations between WUs within a given project which can lead to speed differences.

Our goal is consistency within a given definition of a reference machine setup (described above), but beyond that, the natural variation from machine to machine and WU to WU will never allow any point system to perfectly predict what you get on your machine.

Why is the SMP client important, and why is the benchmark set at that level?

The purpose of the SMP client is twofold: to take advantage of the high-performance capabilities of recent multiprocessor systems and to help develop a simulation architecture that will become one of the dominant FAH computing paradigms as multi-core chips become an industry standard over the next several years. High-performance clients enable us to run types of calculations that would be impractical on our standard architecture--calculations that enhance our scientific capabilities, and your scientific contributions, significantly.

High-performance clients often require more computing resources. SMP clients typically run on dedicated systems, 24 hours a day, and use more processing power, more disk space, more network resources, more system memory, etc. Also, a major part of the scientific benefit is dependent on rapid turnaround of work units; hence we assign short deadlines for SMP work units. To reward those contributors for donating resources beyond the typical CPU client, for completing these work units very quickly within the short deadlines, and for contributing to the development of our next-generation capabilities, we currently set a benchmark value (with included bonus*) proportional to these larger, more demanding SMP work units. Without the SMP clients and your additional contributions, we would not be able to complete many important projects. *Please note the bonus value is subject to change.

What about hyperthreaded (HT) CPUs?

The SMP client was originally intended for multi-core CPUs, which generally do not support HT. For machines with 2 physical CPUs, we do recommend enabling HT for the SMP client as this presents the operating system with what looks like 4 logical processors (and our SMP client is intended for 4 processors). If you have 4 physical CPUs, we recommend against using HT, as this presents the operating system with 8 logical processors, which will make the SMP client run inefficiently (especially since the logical processors coming from HT run much slower than the normal ones).
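
The reasoning above comes down to counting logical processors against the four the client expects. Here is a small Python sketch of that arithmetic, assuming HT presents two logical processors per physical CPU. The function names are made up for illustration, and the FAQ only gives advice for 2 and 4 physical CPUs, so results for other counts are an extrapolation.

```python
TARGET_PROCESSES = 4  # the SMP client is intended for 4 processors


def logical_processors(physical_cpus, hyperthreading):
    """Logical processors the OS sees, assuming HT doubles the count."""
    return physical_cpus * 2 if hyperthreading else physical_cpus


def recommend_hyperthreading(physical_cpus):
    """True only when HT is what brings the logical count up to exactly four."""
    without_ht = logical_processors(physical_cpus, False)
    with_ht = logical_processors(physical_cpus, True)
    return without_ht < TARGET_PROCESSES and with_ht == TARGET_PROCESSES


if __name__ == "__main__":
    for cpus in (2, 4):
        print(cpus, recommend_hyperthreading(cpus))
```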

Why use MPI? Why not threads?

None of our engines are written to be thread-safe or multi-threaded. The only parallelizable codes (Gromacs and AMBER) both use MPI. Making Gromacs use only threads for parallelization isn't possible right now (we talk with the Gromacs developers frequently on this issue), so MPI is the only solution.

How well does MPI work?

The short answer is pretty well on Linux and OSX and not so well on Windows. MPI was originally developed on UNIX, so this is not a surprise (and it's a great feat in many ways for it to even run on Windows). The Windows-specific quirks we're seeing are due to MPI-Windows interaction, and we're trying to hunt them down, as well as try out other MPI possibilities.

Why lock to four processes?

All currently released versions of Gromacs separate the code that sets up a calculation (grompp) from the code that runs it (mdrun), and the number of SMP processes is decided at setup, not at run time. mdrun is the code running in the FAH core, so it has to use a fixed number of SMP processes. We are investigating possible options to change this.

Isn't it needlessly complex to use MPI?

Unfortunately, there aren't other options right now (see above).

Isn't MPI really meant for clustering computers together?

Yes and no. It originally started that way, but with multi-CPU/multi-core boxes, it has become a natural solution there too (as one can code for MPI and run on both architectures).

Does that mean that FAH could support multi-box clusters?

That's on our mind, but we want to try to get SMP working smoothly before going too far in that direction.

Troubleshooting The Client

The client has trouble making connections and shows MPI errors such as "Fatal error in MPI_Wait: Other MPI error, error stack:"

Check out the advice from Pogo, who found an issue with the loopback device giving trouble. Here are simple steps to detect the issue:

  • Run "hostname" to get your local hostname.
  • Run "ping <output from hostname>" and look at the response times from ping. If they are over 1 ms, you have a problem.
  • Or run "traceroute <output from hostname>". There should not be any hops in the route.

Fix:

  • Change your local hostname to something NOT pingable on the internet (i.e. do a ping or nslookup on that name from some other internet connected machine).
  • Add the local hostname to the "127.0.0.1 localhost" line in /etc/hosts.
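
As a complement to the ping and traceroute checks, the following Python sketch tests whether a hostname resolves to a loopback address, which is what the fix is meant to achieve. This is a different check from measuring ping times, and it is not part of the FAH client. The function name and the injectable resolver parameter are made up for illustration; the library calls are from Python's standard socket and ipaddress modules.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname, resolver=socket.gethostbyname):
    """Return True if hostname resolves to a loopback address (127.0.0.0/8).

    resolver is injectable so the check can be exercised without touching
    the real name service; by default the system resolver is used.
    """
    try:
        address = resolver(hostname)
    except OSError:
        # The name does not resolve at all (socket.gaierror is an OSError).
        return False
    return ipaddress.ip_address(address).is_loopback
```

On the affected machine, calling resolves_to_loopback(socket.gethostname()) should return True once the hostname has been added to the 127.0.0.1 line in /etc/hosts.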

Why does the client fail to start, and what does the error mean?

If the client reports an error code -1, it is likely due to an MPI problem. Try re-registering mpiexec by running install.bat again.

If the client fails to start, and FAHlog.txt shows a CoreStatus = 63 (99) error, that indicates a permissions problem, an MPI registration problem, or both. To correct this, set the properties of the fah.exe file to "Run as Administrator" and then run the "install.bat" file again to register MPI and the SMPD service.

When something happens to my network (changing settings or other tweaks), the FAH/SMP client has problems.

One issue with the SMP client is that the client uses MPI to handle multiple processors and MPI uses the network system (albeit on the local loopback device). If the network is tweaked during a run, this can cause problems for the loopback device, which in turn causes problems with MPI and makes Gromacs stop processing. The same happens when a Wi-Fi signal drops or goes out of range and comes back.

We are looking into this and in particular whether we can detect this well enough such that the client restarts from a checkpoint (best case scenario). For now, please don't change the network settings while FAH/SMP is running (you can always stop the FAH client, change the settings, and then restart the client later).

On some home routers, DHCP lease renewals have been found to cause the same issue as changing network settings. Setting a static IP address instead will avoid that problem, and an Ethernet cable will help to avoid the problems of a Wi-Fi connection. These problems are less prevalent in Windows Vista and Windows Server 2008, as the IP stack has several updates that Windows 2000 and XP do not.

How to prevent network drops from killing the Windows SMP folding client in Windows XP

This guide was adapted from one in the [H]ardForum. Please see http://www.hardforum.com/showthread.php?t=1286217 for the original version. Thanks to Killer[MoB] who developed it.

The SMP client often gets frozen or seemingly hung up when the box is otherwise running just fine. The problem 99.9% of the time is a network connection issue. Any kind of issue, such as a bad cable, a faulty port on a hub, router or switch, a NIC in power save mode or just flaky wireless connection problems, could cause the folding client to stop processing. The problem lies in the way the client uses the network for loopback, with the folding cores communicating with each other.

Loopback is the problem and also the solution to the problem. The solution is to install the Microsoft Loopback Adapter. It will help maintain a constant network connection within your own machine. Although this is not its intended purpose, it does work for solving this folding problem. The major difference? Now I can unplug the network cable completely and it never misses a beat.

What is it and is it a piece of hardware?

It is not hardware. It is a virtual network adapter, installed in some cases when you need to test network related things and there is no network available. It is also used when running some virtual machines in various versions of Windows.

Where is it and how can I get it?

You don't need to get it because you already have it. This is almost too easy.

How do I install it and set it up?

1) Go to Add Hardware. (Start>Control Panel>Printers and Other Devices>Add Hardware)

2) You should now be at the add hardware wizard.

  • Click Next.
  • Select "Yes, I have already connected the hardware" and click next.
  • Scroll to the bottom of the list and highlight "Add a new hardware device" and click next.
  • Select "Install the hardware that I manually select from a list (advanced)" and click next.
  • Highlight "Network adapters" and click next.
  • Highlight "Microsoft" under manufacturer and "Microsoft Loopback Adapter" under network adapter. Click next.
  • Click Next again.
  • Now click Finish.
  • Close out the Printers and other hardware window.

3) Go to network connections. (Start>Connect to>Show all connections) or (Start>My Network Places>View Network Connections)

  • You will see a new Local Area Connection that is listed as being a Microsoft Loopback Adapter. You will also see that it says Limited or no connectivity. We are about to fix that.
  • Right click on this new loopback connection and select properties.
  • Double click Internet protocol (TCP/IP).
  • Select use the following IP address.
  • Set the IP address to a 192.168.x.x number that would not normally be assigned on your network, e.g. 192.168.255.200.
  • Click on the subnet mask box and it should auto fill with 255.255.255.0.
  • Leave everything else blank.
  • Click OK.

This may take the machine a couple of minutes to think about. Once it's done, you can close all the windows. Note that in the Network Connections window, the loopback adapter now shows as connected.
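
If you want to double-check the address you picked in step 3, here is a small Python sketch using the standard ipaddress module. The function name is made up for illustration; the sketch only confirms that the address falls in the 192.168.x.x range and shows which network the 255.255.255.0 mask puts it on. It cannot tell whether the address is already in use on your network.

```python
import ipaddress

PRIVATE_RANGE = ipaddress.ip_network("192.168.0.0/16")  # the 192.168.x.x block


def loopback_adapter_settings(address, netmask="255.255.255.0"):
    """Validate address and return (ip, netmask, network) as strings."""
    interface = ipaddress.ip_interface(f"{address}/{netmask}")
    if interface.ip not in PRIVATE_RANGE:
        raise ValueError(f"{address} is not a 192.168.x.x address")
    return str(interface.ip), str(interface.netmask), str(interface.network)


if __name__ == "__main__":
    print(loopback_adapter_settings("192.168.255.200"))
```

Pick a third number (the 255 in this example) that differs from the one your router hands out, so the loopback adapter ends up on its own network.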

Windows Vista and SMP

Windows Vista has additional security features not found in previous versions of Windows. In certain situations, these security features might interfere with the installation and/or operation of the SMP Windows client. Please see the FAH WIKI: Vista and SMP page for installation and troubleshooting assistance. The installation guides also cover this so if you need to know about the specific steps to make it work under Vista, check the Guide section for more details.

My Linux machine hangs at the "4 NNODES" line.

Here's a suggested fix by "Jimmy2Shoe" (see this thread) for RedHat FC6 (the method that I use might differ for different distros...)

  1. System --> Administration --> Network: In devices, select eth0, or whichever device is your net connection. Press EDIT.
  2. In DHCP settings, put in a hostname of your choice, something original, and exit that window.
  3. In the "DNS" Tab, use the same hostname you selected.
  4. Close network configuration, save settings.
  5. Open up the hosts file under my computer --> filesystem --> etc --> hosts.
  6. With the text editor now open, type on the first line: 127.0.0.1 [press tab] hostname you selected above
  7. Save and close the text editor. Reboot.

The main gist of this is to make sure that your localhost is set up in a way that the MPI libraries like.
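
To confirm that the hosts file ended up the way steps 5 to 7 intend, the following Python sketch scans hosts-file text for a 127.0.0.1 line that lists a given hostname. The function name and the sample hostname "foldingbox" are made up for illustration, and the check is not part of the FAH client.

```python
def hostname_on_loopback_line(hosts_text, hostname):
    """Return True if hostname is listed on a 127.0.0.1 line of a hosts file."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split the entry on whitespace.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] == "127.0.0.1" and hostname in entry[1:]:
            return True
    return False


if __name__ == "__main__":
    sample = "127.0.0.1\tfoldingbox\n127.0.0.1\tlocalhost\n"
    print(hostname_on_loopback_line(sample, "foldingbox"))
```

To run it against the real file, read /etc/hosts into a string and pass the hostname you chose in step 2.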


Last Updated on February 06, 2009, at 08:48 PM by