Causality detection based on information-theoretic approaches in time series analysis Hlavackova-Shindler Palus 2009

<bibtex>@article{hlaváčková2007causality,

 title={Causality detection based on information-theoretic approaches in time series analysis},
 author={Hlav{\'a}{\v{c}}kov{\'a}-Schindler, K. and Palu{\v{s}}, M. and Vejmelka, M. and Bhattacharya, J.},
 journal={Physics Reports},
 volume={441},
 number={1},
 pages={1--46},
 year={2007},
 publisher={Elsevier}

} </bibtex>

Great map to the literature. I found in it what I have been doing until now and why that is wrong. It was here that I finally sat down and thought through what the calculation of entropy actually does: weight each p value by its log and take the negative of the sum.
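
To pin that down for myself, here is a minimal sketch of the naive plug-in entropy estimate I had been computing. The bin count, the test data and the use of natural logs are my own arbitrary choices, not anything from the paper.

<pre>
import numpy as np

def plugin_entropy(samples, bins=16):
    """Naive plug-in Shannon entropy: bin the data, weight each
    bin probability p by its log, and take the negative sum."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                    # treat 0 * log(0) as 0
    return -np.sum(p * np.log(p))   # nats; use np.log2 for bits

x = np.random.default_rng(0).normal(size=1000)
print(plugin_entropy(x))
</pre>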

It introduced me to the Kullback-Leibler divergence (KLD, 1951) as an alternative to mutual information: "Mutual information is the KLD of the product P(X)P(Y) from the joint distribution P(X,Y)", as demonstrated in <bibtex>@article{gelman2003bayesian,

 title={Bayesian data analysis. Texts in statistical science},
 author={Gelman, A. and Carlin, J.B. and Stern, H.S. and Rubin, D.B.},
 journal={Boca Raton (Florida): Chapman \& Hall/CRC Press},
 volume={200},
 pages={696},
 year={2003}

} </bibtex>
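
Writing that identity out made it stick: mutual information is exactly the KL divergence from the independence model P(X)P(Y) to the joint P(X,Y). A rough histogram-based sketch of my own (the bin count and the toy data are arbitrary choices, not from either reference):

<pre>
import numpy as np

def mi_as_kld(x, y, bins=8):
    """Mutual information I(X;Y) computed literally as
    D_KL( P(X,Y) || P(X)P(Y) ) on a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal P(X)
    py = pxy.sum(axis=0, keepdims=True)   # marginal P(Y)
    prod = px * py                        # independence model P(X)P(Y)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / prod[mask]))

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = x + 0.5 * rng.normal(size=2000)       # dependent pair, so MI > 0
print(mi_as_kld(x, y))
</pre>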

At the definition of the "norm of the mutual information" I asked: is this the average MI added in per time step in the window?
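
As far as I can tell the answer is more or less yes: the norm averages the mutual information I(x; x_τ) over the lags in the window. A sketch of how I read it (the binning, the lag range and the test series are my own choices):

<pre>
import numpy as np

def lagged_mi(x, tau, bins=8):
    """Plug-in mutual information I(x_t; x_{t+tau}) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x[:-tau], x[tau:], bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px * py)[mask]))

def mi_norm(x, tau_min=1, tau_max=20, bins=8):
    """Average I(x; x_tau) over the lag window [tau_min, tau_max]."""
    return np.mean([lagged_mi(x, tau, bins) for tau in range(tau_min, tau_max + 1)])

x = np.cumsum(np.random.default_rng(5).normal(size=5000))   # strongly correlated series
print(mi_norm(x))
</pre>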

At the bottom of p 190 is the discussion of why I should be using transfer entropy instead of mutual information.
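
As I understand the argument, mutual information is symmetric and says nothing about direction, while transfer entropy conditions on the target's own past, so T_{Y→X} and T_{X→Y} can differ. A crude binned sketch of that idea; this is my own construction, not the paper's estimator, and the bin count, the single-step histories and the toy system are all arbitrary:

<pre>
import numpy as np

def transfer_entropy(x, y, bins=4):
    """Crude binned estimate of T_{Y->X} = I(X_{t+1}; Y_t | X_t): how much
    Y's present adds about X's next step beyond X's own past."""
    xs = np.digitize(x, np.histogram_bin_edges(x, bins)[1:-1])
    ys = np.digitize(y, np.histogram_bin_edges(y, bins)[1:-1])
    x_next, x_now, y_now = xs[1:], xs[:-1], ys[:-1]

    def H(*cols):
        """Joint plug-in entropy of the given symbol columns."""
        _, counts = np.unique(np.column_stack(cols), axis=0, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log(p))

    # I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)
    return H(x_next, x_now) + H(y_now, x_now) - H(x_next, y_now, x_now) - H(x_now)

rng = np.random.default_rng(2)
y = rng.normal(size=5000)
x = np.roll(y, 1) + 0.3 * rng.normal(size=5000)   # X is driven by Y's past
print(transfer_entropy(x, y), transfer_entropy(y, x))
</pre>

On this toy system T_{Y→X} comes out much larger than T_{X→Y}, which is exactly the asymmetry that plain mutual information cannot show.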

p. 196 discusses partitioning, and other approaches besides the naive binning I have been using.
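
The contrast I want to remember here is between equal-width bins (what I have been doing) and equiquantal bins that put roughly the same number of points in each cell. A quick sketch of both, my own and not the paper's algorithm:

<pre>
import numpy as np

def equidistant_bins(x, k=8):
    """Split the range of x into k equal-width intervals (the naive binning)."""
    return np.digitize(x, np.linspace(x.min(), x.max(), k + 1)[1:-1])

def equiquantal_bins(x, k=8):
    """Split x at its quantiles so each of the k bins holds about n/k points."""
    return np.digitize(x, np.quantile(x, np.linspace(0, 1, k + 1)[1:-1]))

x = np.random.default_rng(3).lognormal(size=1000)   # skewed data
print(np.bincount(equidistant_bins(x, 8)))          # very uneven bin occupancy
print(np.bincount(equiquantal_bins(x, 8)))          # roughly 125 points per bin
</pre>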

It then talks about learning methods, and then about kernel methods, which I am clueless about.

The paper wraps up, as promised, with a discussion of Granger causality, briefly: "for a pair of stationary, weakly dependent, bivariate time series X, Y, Y is a Granger cause of X if the distribution of X, given past observations of X and Y, differs from the distribution of X given past observations of X only."
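
That definition compares the distribution of X given its own past with and without Y's past added in. The standard linear shortcut (my own sketch using plain least squares; nothing here is from the paper) is to ask whether adding lags of Y to an autoregression of X reduces the residual variance:

<pre>
import numpy as np

def granger_residual_variances(x, y, lag=2):
    """Residual variance of predicting x from its own past, versus from
    its own past plus y's past (the linear Granger comparison)."""
    n = len(x)
    target = x[lag:]
    past_x = np.column_stack([x[lag - i - 1: n - i - 1] for i in range(lag)])
    past_y = np.column_stack([y[lag - i - 1: n - i - 1] for i in range(lag)])
    ones = np.ones((n - lag, 1))

    def resid_var(design):
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.var(target - design @ beta)

    restricted = resid_var(np.hstack([ones, past_x]))
    full = resid_var(np.hstack([ones, past_x, past_y]))
    return restricted, full   # full much smaller than restricted suggests Y Granger-causes X

rng = np.random.default_rng(4)
y = rng.normal(size=3000)
x = np.zeros(3000)
for t in range(1, 3000):
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + 0.1 * rng.normal()
print(granger_residual_variances(x, y, lag=2))
</pre>

If I follow the paper, this is the same comparison that transfer entropy makes, only without the linearity assumption.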

"Theoretically, for a good entropy estimator, the condition of consistency seems to be important." I have no idea what this means, but it looks important, no?

Bibliography

<bibtex>@article{schreiber2000measuring,

 title={Measuring information transfer},
 author={Schreiber, T.},
 journal={Physical review letters},
 volume={85},
 number={2},
 pages={461--464},
 year={2000},
 publisher={APS}

} </bibtex> Schreiber walks through conditional mutual information as a Markov process.

<bibtex>@article{hlaváčková2007causality,

 title={Causality detection based on information-theoretic approaches in time series analysis},
 author={Hlav{\'a}{\v{c}}kov{\'a}-Schindler, K. and Palu{\v{s}}, M. and Vejmelka, M. and Bhattacharya, J.},
 journal={Physics Reports},
 volume={441},
 number={1},
 pages={1--46},
 year={2007},
 publisher={Elsevier}

} </bibtex>

  • shows that transfer entropy is equivalent to conditional mutual information
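
For my own notes, the identity that bullet refers to, in Schreiber's notation with Markov orders k and l (my transcription, so the indexing may differ slightly from the originals):

<math>
T_{Y \to X} = \sum p\left(x_{n+1}, x_n^{(k)}, y_n^{(l)}\right)
\log \frac{p\left(x_{n+1} \mid x_n^{(k)}, y_n^{(l)}\right)}{p\left(x_{n+1} \mid x_n^{(k)}\right)}
= I\left(x_{n+1}; y_n^{(l)} \mid x_n^{(k)}\right)
</math>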

<bibtex> @article{paluš1996coarse,

 title={Coarse-grained entropy rates for characterization of complex time series},
 author={Palu{\v{s}}, M.},
 journal={Physica D: Nonlinear Phenomena},
 volume={93},
 number={1-2},
 pages={64--77},
 year={1996},
 publisher={Elsevier}

} </bibtex>

  • "In order to obtain an asymptotic entropy estimate onf an m-dimensiona dynamical system, large amounts of data are necessary. To avoid this, Palus proposed to compute "course-grained entropy rates" as relative measures of "information creation" and of regularity and predictability of studied processes"