HierRegressions

From FarmShare

(Difference between revisions)
Jump to: navigation, search
Line 1: Line 1:
== Hierarchical Linear Regressions ==
== Hierarchical Linear Regressions ==
 +
This Hierarchical Linear Model example walks through http://www.r-tutor.com/gpu-computing/rbayes/rhierlmc as implemented on FarmShare.
-
http://www.r-tutor.com/gpu-computing/rbayes/rhierlmc
+
The R libraries used here can be downloaded from: http://www.r-tutor.com/content/download
-
 
+
-
http://www.r-tutor.com/content/download
+

Revision as of 11:59, 8 September 2013

Contents

Hierarchical Linear Regressions

This Hierarchical Linear Model example walks through http://www.r-tutor.com/gpu-computing/rbayes/rhierlmc as implemented on FarmShare.

The R libraries used here can be downloaded from: http://www.r-tutor.com/content/download


installing rpud

Rpud comes in two parts. The first is a standard R library which can be installed the usual way, except we pass NVCC_OPT to instruct the code generator to produce code for the 2.0 level devices we are using. The second part is distributed as a binary blob which has an install.sh.

Here we get and install the first part into your home directory (under lib/R).

module load cuda
mkdir /tmp/$USER
cd /tmp/$USER
wget http://www.r-tutor.com/sites/default/files/rpud/rpud_0.3.3.tar.gz
mkdir ~/lib/R
export R_LIBS_USER="${HOME}/lib/R"
NVCC_OPT="-arch compute_20" R CMD INSTALL --library=$R_LIBS rpud_0.3.3.tar.gz

Here we get and install the second part:

module load cuda
cd /tmp/$USER
wget http://www.r-tutor.com/sites/default/files/rpud/rpudplus_0.3.3.tar.gz
export R_LIBS_USER="${HOME}/lib/R"
cd rpudplus_0.3.3
./install.sh


running on CPU

As a comparison, we will run on CPU using standard bayesm R library.

save following source code as rpudtestcpu.r

library(bayesm)

data("cheese")
str("cheese")

retailer <- levels(cheese$RETAILER)
nreg <- length(retailer); nreg

regdata <- NULL

for (i in 1:nreg) {
    filter <- cheese$RETAILER==retailer[i]
    y <- log(cheese$VOLUME[filter])
    X <- cbind(1,      # intercept placeholder
            cheese$DISP[filter],
            log(cheese$PRICE[filter]))
    regdata[[i]] <- list(y=y, X=X)
}

Data <- list(regdata=regdata)
Mcmc <- list(R=2000)

system.time(
outcpu <- bayesm::rhierLinearModel(
         Data=Data,
         Mcmc=Mcmc))

beta.3cpu <- mean(as.vector(outcpu$betadraw[, 3, 201:2000]))
beta.3cpu

Run it. In this case it took 42 seconds.

cp -p /farmshare/software/examples/R/hierregress/cheese.rda .
R --no-save < rpudtestcpu.r

you should see output similar to:

bishopj@rye01:~$ R --no-save < rpudtestcpu.r 

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> library(bayesm)
> 
> data("cheese")
> str("cheese")
 chr "cheese"
> 
> retailer <- levels(cheese$RETAILER)
> nreg <- length(retailer); nreg
[1] 88
> 
> regdata <- NULL
> 
> for (i in 1:nreg) {
+     filter <- cheese$RETAILER==retailer[i]
+     y <- log(cheese$VOLUME[filter])
+     X <- cbind(1,      # intercept placeholder
+             cheese$DISP[filter],
+             log(cheese$PRICE[filter]))
+     regdata[[i]] <- list(y=y, X=X)
+ }
> 
> Data <- list(regdata=regdata)
> Mcmc <- list(R=2000)
> 
> system.time(
+ outcpu <- bayesm::rhierLinearModel(
+          Data=Data,
+          Mcmc=Mcmc))
Z not specified -- putting in iota
 
Starting Gibbs Sampler for Linear Hierarchical Model
    88  Regressions
    1  Variables in Z (if 1, then only intercept)
 
Prior Parms: 
Deltabar
     [,1] [,2] [,3]
[1,]    0    0    0
A
     [,1]
[1,] 0.01
nu.e (d.f. parm for regression error variances)=  3
Vbeta ~ IW(nu,V)
nu =  6
V 
     [,1] [,2] [,3]
[1,]    6    0    0
[2,]    0    6    0
[3,]    0    0    6
 
MCMC parms: 
R=  2000  keep=  1
 
MCMC Iteration (est time to end - min) 
  100  ( 0.7 )
  200  ( 0.7 )
  300  ( 0.6 )
  400  ( 0.6 )
  500  ( 0.5 )
  600  ( 0.5 )
  700  ( 0.5 )
  800  ( 0.4 )
  900  ( 0.4 )
  1000  ( 0.4 )
  1100  ( 0.3 )
  1200  ( 0.3 )
  1300  ( 0.2 )
  1400  ( 0.2 )
  1500  ( 0.2 )
  1600  ( 0.1 )
  1700  ( 0.1 )
  1800  ( 0.1 )
  1900  ( 0 )
  2000  ( 0 )
  Total Time Elapsed:  0.71 
   user  system elapsed 
 42.288   0.036  42.448 
> 
> beta.3cpu <- mean(as.vector(outcpu$betadraw[, 3, 201:2000]))
> beta.3cpu
[1] -2.146799
>

running on GPU

save following source code as rpudtestgpu.r

library(bayesm)

data("cheese")
str("cheese")

retailer <- levels(cheese$RETAILER)
nreg <- length(retailer); nreg

regdata <- NULL

for (i in 1:nreg) {
    filter <- cheese$RETAILER==retailer[i]
    y <- log(cheese$VOLUME[filter])
    X <- cbind(1,      # intercept placeholder
            cheese$DISP[filter],
            log(cheese$PRICE[filter]))
    regdata[[i]] <- list(y=y, X=X)
}


Data <- list(regdata=regdata)
Mcmc <- list(R=2000)

library(rpud)    # load rpudplus

system.time(
outgpu <- rpud::rhierLinearModel(
	Data=Data,
	Mcmc=Mcmc,
	output="bayesm"))

beta.3 <- mean(as.vector(outgpu$betadraw[, 3, 201:2000]))
beta.3

Run it. For GPU it takes 0.35 seconds, down from 42.448 on the CPU alone.

bishopj@rye01:~$ module load cuda
bishopj@rye01:~$ export R_LIBS_USER=$HOME/lib/R
bishopj@rye01:~$ R --no-save < rpudtestgpu.r 

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> library(bayesm)
> 
> data("cheese")
> str("cheese")
 chr "cheese"
> 
> retailer <- levels(cheese$RETAILER)
> nreg <- length(retailer); nreg
[1] 88
> 
> regdata <- NULL
> 
> for (i in 1:nreg) {
+     filter <- cheese$RETAILER==retailer[i]
+     y <- log(cheese$VOLUME[filter])
+     X <- cbind(1,      # intercept placeholder
+             cheese$DISP[filter],
+             log(cheese$PRICE[filter]))
+     regdata[[i]] <- list(y=y, X=X)
+ }
> 
> 
> Data <- list(regdata=regdata)
> Mcmc <- list(R=2000)
> 
> library(rpud)    # load rpudplus
Rpudplus 0.3.3
http://www.r-tutor.com
Copyright (C) 2010-2013 Chi Yau. All Rights Reserved.
Rpudplus is free for academic use only. There is absolutely NO warranty.

Attaching package: 'rpud'

The following object(s) are masked from 'package:bayesm':

    rhierLinearModel, rmultireg

> 
> system.time(
+ outgpu <- rpud::rhierLinearModel(
+ 	Data=Data,
+ 	Mcmc=Mcmc,
+ 	output="bayesm"))
 
Starting Gibbs Sampler for Linear Hierarchical Model
    88  Regressions
    1  Variables in Z (if 1, then only intercept)
 
Prior Parms: 
===========  
Deltabar
     [,1] [,2] [,3]
[1,]    0    0    0
 
A
     [,1]
[1,] 0.01
 
nu.e (d.f. for regression error variances) = 3
 
Vbeta ~ IW(nu,V)
 
nu = 6
 
V 
     [,1] [,2] [,3]
[1,]    6    0    0
[2,]    0    6    0
[3,]    0    0    6
 
MCMC parms: 
==========  
R = 2000, keep = 1
 
....................
   user  system elapsed 
  0.192   0.156   0.350 
> 
> beta.3 <- mean(as.vector(outgpu$betadraw[, 3, 201:2000]))
> beta.3 
[1] -2.145573
>
Personal tools
Toolbox
LANGUAGES