Giles Hooker

Assistant Professor:

Department of Statistical Science
Department of Biological Statistics and Computational Biology

giles [dot] hooker [at] cornell [dot] edu

1186 Comstock Hall
Cornell University
Ithaca, NY 14853

Phone: (1 607) 255 1638
Fax: (1 607) 255 4698

Curriculum Vitae

Research Interests:

  • Data analysis for dynamical systems and differential equations
  • Functional data analysis
  • Machine learning and data mining

    My research focusses on a number of issues within these three fields. I am particularly interested in developing and extending the methods of functional data analysis for examining the evolution of systems in terms of nonlinear differential equations. This involves estimating parameters for such equations, diagnosing when and why equations do not fit data well and developing statistical theory to account for smooth perturbations of such systems.

    Within functional data analysis, I also work on projects involving the estimation of several dependent functional quantities. A particular example is the estimation of a regression line as well as a density function for its residuals. When and how one introduces smoothness penalties in such contexts is an open question. These methods fall under a concept that Jim Ramsay has labelled Parameter Cascades and are also used in our work on estimating differential equations.

    In machine learning, I focus on diagnostics and on understanding the prediction functions that machine learning produces. Recent work in this area includes estimates for conditional density and quantile functions. I am also interested in analyzing the results of experiments in machine learning to determine when and why particular methods were successful.

    Teaching:

    BTRY 6150: Applied Functional Data Analysis, Fall 2008.

    CSCU Workshop: Introduction to Functional Data Analysis, March 27, 28, 2008.

    BTRY 694: Theory of Multivariate Statistics, Spring 2008

    BTRY 694: Statistical Learning Theory, Fall 2007

    BTRY 694: Functional Data Analysis, Spring 2007

    Publications:

    Giles Hooker and Saharon Rosset, 2008. "Prediction-Focussed Regularization Using Data-Augmented Regression". Submitted.

    Giles Hooker, Matthew Finkelman and Armin Schwartzman, 2008, "Paradoxical Results in Multidimensional Item Response Theory". Submitted. Slides from a recent talk.

    Matthew Finkelman, Giles Hooker and Jane Wang, 2007, "Unidentifiability and Lack of Monotonicity in the Multidimensional Three-Parameter Logistic Model". Submitted.

    Giles Hooker, 2008. "Forcing Function Diagnostics for Nonlinear Dynamics". Biometrics, accepted.

    Giles Hooker and Larry Biegler, 2007. "IPOPT and Neural Dynamics: Tips, Tricks and Diagnostics", Technical Report BU-1676-M, Department of Biological Statistics and Computational Biology, Cornell University. A demonstration bundle provides data and AMPL code from this estimation.

    James Ramsay, Giles Hooker, David Campbell and Jiguo Cao, 2007. "Parameter Estimation for Differential Equations: A Generalized Smoothing Approach". Journal of the Royal Statistical Society, Series B, Vol 69, No 5 (with discussion). See the profiling webpages for Matlab code, manuals and webpage demonstrations.

    Giles Hooker, 2007, "Theorems and Calculations for Smoothing-based Profiled Estimation of Differential Equations", Technical Report BU-1671-M, Department of Biological Statistics and Computational Biology, Cornell University.

    Giles Hooker and James O. Ramsay, 2005. "Learned-Loss Boosting." Submitted. Matlab software is also available.

    Giles Hooker, 2007. "Generalized Functional ANOVA Diagnostics for High Dimensional Functions of Dependent Variables". Journal of Computational and Graphical Statistics. Vol. 16, No 3.

    Robert Norris, Jessica Ngo, Karen Nolan and Giles Hooker, 2005. "Volunteers are Unable to Properly Apply Pressure Immobilization in a Simulated Snakebite Scenario". Journal of Wilderness and Environmental Medicine, Vol 16, No 1.

    Armin Schwartzman, Matthew Finkelman and Giles Hooker, 2004. "The Stanford Statistics Songbook: A Musical Tribute". Technical Report, Department of Statistics, Stanford University.

    Giles Hooker, 2004. "Diagnostics and Extrapolation in Machine Learning". PhD Thesis, Department of Statistics, Stanford University.

    Giles Hooker and Matthew Finkelman, 2004. "Sequential Analysis for Learning Modes of Browsing". WEBKDD 2004: Proceedings of the Sixth International Workshop on Knowledge Discovery from the Web.

    Giles Hooker, 2004. "Diagnosing Extrapolation: Tree-Based Density Estimation". Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

    Giles Hooker, 2004. "Discovering ANOVA Structure in Black Box Functions". Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

    Giles Hooker and Fuliang Weng, 2003. "Subset Selection in Large, Sparse Systems: An application of the Forward Stagewise approach to Natural Language Processing". Technical Report, Robert Bosch Corp.

    Michael Shirts, Eric Bair, Giles Hooker and Vijay Pande, 2003. "Equilibrium Free Energies from Nonequilibrium Estimates Using Maximum Likelihood Methods". Physical Review Letters. Vol 91, No 14.

    Giles Hooker, 1999. "Developing a Spline Smoothed Density". Honours Thesis, Department of Mathematics, Australian National University.

    Markus Hegland, Giles Hooker and Stephen Roberts, 1999. "Finite Element Thin Plate Splines in Density Estimation". Computational Techniques and Applications: Proceedings of the Ninth Biennial Conference: CTAC99. Journal of the Australian Mathematical Society, Series B (special issue).

    Prospective Publications:

    These are paper ideas that I think worthwhile; some are more likely to turn into papers than others, and some are larger projects than others. Anybody who is interested in them, has relevant data, or knows of authors who have beaten me to it is highly encouraged to contact me. Many of these are also potential graduate student projects.

    "Testing for Missing Components in Nonlinear Dynamics". Combines the ideas from "Forcing Function Diagnostics" with techniques from Chaotic Data Analysis to detect when a system has been specified with too low an order.

    "Boosting for Conditional Density and Quantile Estimates": describes a boosting scheme to estimate the conditional density of a response given features at each point in feature space; extensions to directly estimating quantile functions are possible.

    "Disparity Estimation in Nonlinear and Nonparametric Regression". Considers ways to perform Hellinger-distance and other disparity-based estimation for nonlinear regression and extensions to non-parametric regression. With Anand Vidyashankar.

    "Functional Multiple Linear Regression; Convolution and Model Selection". Looks at performing model selection in the presence of many functional predictors. With Oliver Gao.

    "Semi-Parametric Boosting": generalizes the ideas in "Learned-Loss Boosting" to a semi-parametric context in which there is an infinite dimensional non-parametric component. Not clear how well this would work, but worth a try.

    "Experiments in Extrapolation and Truncation": Considers a number of post-hoc truncation methods to deal with extrapolation. I will set up some careful experiments and use real-world data to determine what aspects of extrapolation are most salient.

    "A Tale of Two ANOVAs": suppose we have some set of prediction functions for the same task and we want to evaluate where those functions differ. These differences can be measured point-wise using the functional ANOVA in the sense of Ramsay and Silverman. These point-wise differences then define a high dimensional function that may be investigated through a functional ANOVA in the sense of Gu and Wahba. Together, they may provide some insights into what distinguishes the output of different machine learning algorithms.