Software implementing 1PFS

(1 Pass Filtering with Shrinkage)

Suhrid Balakrishnan and David Madigan

Last update: Mar 23, 2004

Overview

This software implements the One Pass Sequential Monte Carlo method (1PFS, Balakrishnan and Madigan, 2004) and allows users to independently verify the results obtained in the accompanying manuscript.

MATLAB (r13) code is provided for various aspects of the two case studies presented in the paper. Unfortunately, the dataset used in the paper for Example I (the fully Bayesian logistic regression) is proprietary, so users cannot reproduce our results for that example. Example II (the mixture of Markov chains case study) uses simulated data, and a data generator is provided.

User Guide

The code for the two examples is bundled separately, although the bundles share a lot of common code; the main reason is to assist users who are interested in implementing only one of the case studies. The code is self-contained except for some random variable generation routines, for which users need to download and install (at least some files from) Tom Minka’s lightspeed toolbox for MATLAB. The code consists of a small number of m-files, which we hope contain enough documentation to be readily usable.

Obtaining results similar to ours for either example involves three main steps: (1) generate the data; (2) obtain the exact posterior parameter particles on a small initial portion of the dataset (MCMC routines are provided for this task); (3) run the 1PFS algorithm on the remaining data to obtain the final posterior parameter particles.

1.      Generating the data – As stated above, Example I uses proprietary data, so users will need to supply their own dataset for evaluation. The dataset should be an ASCII flat file (or any format MATLAB can read conveniently), with each row containing the observed value of the binary response variable (the outcome to be predicted) followed by the vector of regression variables. For example: 0 -0.272 -0.174 -0.548 0.965 -1.003 -0.409 -0.460 1.675 -0.764

For Example II, a data generating routine is provided (generate_data.m).

2.      Obtaining the exact posterior parameter particles for a small portion of the data (the starting point for 1PFS) – The provided routine MCMC_on_D.m performs this task for both examples. The MCMC parameters, the size of this initial portion, and so on are all tunable within this routine; you can also run the chain for as long as you like, thin the drawn particles, etc.

In our implementation, the particles obtained in this step are saved to a mat-file. This makes it possible to tune 1PFS repeatedly while starting from the same initial set of particles.

3.      Running 1PFS – Both examples use a MATLAB main program script (1PFS_mainprog.m) to run 1PFS once an initial set of particles has been obtained in the previous step. The parameters (where to start reading the full data, how large the blocks of data processed together should be, etc.) can all be set within this file. One implementation detail that is perhaps not explicit in the manuscript is that each iteration processes a block of observations from the dataset rather than a single row. Further, users interested in truly large datasets for Example I (where the data cannot be held in memory) are advised to use the data-reading mechanism from Example II, which keeps only the relevant block of observations in memory.
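The row format from step 1 and the block-wise reading from step 3 can be illustrated with a short sketch. It is written in Python purely for illustration and is not the MATLAB code shipped with the package. The function names (parse_row, read_blocks), the skip_rows parameter and the demo rows are all hypothetical; the sketch only assumes a whitespace-delimited file with a 0/1 response in the first column.

```python
import io

def parse_row(line):
    """Split one whitespace-delimited row into (binary response, feature list)."""
    fields = line.split()
    y = int(fields[0])
    if y not in (0, 1):
        raise ValueError(f"expected a 0/1 response, got {fields[0]!r}")
    x = [float(v) for v in fields[1:]]
    return y, x

def read_blocks(handle, block_size, skip_rows=0):
    """Yield lists of parsed rows, holding at most block_size rows in memory.

    skip_rows (hypothetical) counts physical lines to pass over, e.g. the
    initial portion already used for the MCMC step.
    """
    block = []
    for index, line in enumerate(handle):
        if index < skip_rows or not line.strip():
            continue
        block.append(parse_row(line))
        if len(block) == block_size:
            yield block
            block = []
    if block:  # final, possibly shorter, block
        yield block

# Made-up demo data standing in for a file on disk.
demo = io.StringIO(
    "0 -0.272 -0.174 -0.548\n"
    "1 0.965 -1.003 -0.409\n"
    "0 -0.460 1.675 -0.764\n"
    "1 0.100 0.200 0.300\n"
    "0 0.500 -0.500 0.250\n"
)
blocks = list(read_blocks(demo, block_size=2, skip_rows=1))
block_sizes = [len(b) for b in blocks]
```

Because read_blocks is a generator, only the current block is held in memory, which is the property the Example II reading mechanism is described as providing.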

Wherever there is a lightspeed toolbox dependency, it has been noted in the corresponding m-file. 

Directory Listing

            Example I (Fully Bayes Logistic Regression)

·        1PFS_mainprog.m          

·        kitagawa_resample_move.m 

·        wt_mean.m

·        MCMC_on_D.m              

·        mainprog_ess_kitagawa.m

·        importance_sample.m       

·        myunifrnd.m

            Example II (Mixture of Markov Chains)

·        1PFS_mainprog.m         

·        generate_data.m         

·        read_data_block.m

·        MCMC_on_D.m             

·        importance_sample.m     

·        rnd_dirichlet.m

·        cluster_cond_sum.m      

·        kitagawa_resample.m     

·        weighted_sample.m

·        dirichlet_move.m        

·        my_gamrnd.m             

·        wt_mean.m

·        evaluate_ll.m           

·        particles_to_matrices.m
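Several of the file names above (wt_mean.m, mainprog_ess_kitagawa.m, kitagawa_resample.m) suggest the usual particle-method building blocks: a weighted mean, an effective sample size (ESS) check and a resampling step. The sketch below illustrates those three ideas in Python. It is not a translation of the shipped m-files, all function names are hypothetical, and the use of systematic resampling (one common low-variance scheme) is an assumption about which variant the package implements.

```python
import random

def normalise(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]

def weighted_mean(particles, weights):
    """Weighted average of scalar particles."""
    w = normalise(weights)
    return sum(wi * p for wi, p in zip(w, particles))

def effective_sample_size(weights):
    """ESS = 1 / (sum of squared normalised weights)."""
    w = normalise(weights)
    return 1.0 / sum(wi * wi for wi in w)

def systematic_resample(weights, rng=random):
    """Return N particle indices using one uniform offset and N evenly spaced points."""
    w = normalise(weights)
    n = len(w)
    offset = rng.random() / n          # single random draw in [0, 1/n)
    indices, cumulative, i = [], w[0], 0
    for k in range(n):
        point = offset + k / n
        # Advance to the particle whose cumulative-weight interval contains point.
        while point >= cumulative and i < n - 1:
            i += 1
            cumulative += w[i]
        indices.append(i)
    return indices

uniform_ess = effective_sample_size([1, 1, 1, 1])
degenerate_ess = effective_sample_size([1, 0, 0, 0])
picked = systematic_resample([0, 0, 1, 0], rng=random.Random(0))
```

A typical use is to resample only when the ESS falls below some threshold, so that particles are not needlessly duplicated while the weights are still well spread.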

Download and Installation

The software is available in tarred-zipped form.

·        Download files for Example I

·        Download files for Example II

To use the software, unzip and extract the archive. Most m-files have internal comments, which MATLAB displays when you type help followed by the file name.

Acknowledgments

We thank Greg Ridgeway and David Scott for helpful discussions.

Legal Notice

Software is free for non-commercial use.

Software is provided as is, without any guarantee. Authors are not responsible for implications from the use of this software.

Feedback

Questions, comments, bug reports welcome by email: suhrid@paul.rutgers.edu
