Van Dantzig Seminar

nationwide series of lectures in statistics


Van Dantzig Seminar: 12 December 2013

Programme:

14:00 - 14:05 Opening
14:05 - 15:05 Pascal Massart (Université de Paris-Sud, Orsay)
15:05 - 15:25 Break
15:25 - 16:25 Joris Mooij (University of Amsterdam and Radboud University Nijmegen)
16:30 - 17:30 Reception
Location: VU University Amsterdam, W&N Building, De Boelelaan 1085, Room C6.69

Titles and abstracts

  • Pascal Massart

    Data driven penalties

    Model selection is a classical topic in statistics. The idea of selecting a model by penalizing a log-likelihood type criterion goes back to the early seventies with the pioneering works of Mallows and Akaike. One can find many consistency results in the literature for such criteria. These results are asymptotic in the sense that one deals with a given number of models while the number of observations tends to infinity. We shall give an overview of a non-asymptotic theory for model selection which has emerged in recent years. In various contexts of function estimation it is possible to design penalized log-likelihood type criteria with penalty terms depending not only on the number of parameters defining each model (as for the classical criteria) but also on the complexity of the whole collection of models to be considered. For these methods to be of practical relevance, it is desirable to have a precise expression for the penalty terms involved in the penalized criteria on which they are based. Concentration inequalities (the prototype being Talagrand's inequality for empirical processes) lead to non-asymptotic risk bounds for the corresponding penalized estimators, showing that they perform almost as well as if the best model (i.e. the one with minimal risk) were known. They also provide explicit shapes for the penalties. However, a multiplicative tuning constant typically remains to be chosen by the user. Our purpose will be to give an account of the theory, discuss some heuristics for choosing this multiplicative constant from the data, and provide some results validating this data-driven strategy.

  • Joris Mooij

    Causal Modeling of Feedback Systems

    In this talk, I will discuss some recent developments in causal modeling and causal discovery from data in the context of systems that may have feedback loops. The talk will start with a general introduction to causal inference, a branch of statistics and machine learning. Next, I will discuss Structural Causal Models (SCMs), a generalization of Bayesian networks that enables the modeling of causal feedback loops (cycles). I will present our recent result showing that the equilibrium states of systems of ordinary differential equations can be described by SCMs, which gives a new interpretation of cyclic SCMs. Then, I will discuss how SCMs can be used to infer causal relationships from data. A challenging application to a protein expression data set will be discussed in detail; it shows that cyclic SCMs are a promising alternative to Bayesian networks. I will conclude with an outlook on future research in this area.

    Download the slides

BTK, Amsterdam 2013