### Author: Niklas Norén

### Abstract

Dependency derivation is the search for combinations of variables (or
states of variables) in a database that co-occur unexpectedly often.
In Bayesian dependency derivation, indications are ranked primarily by
their estimated strengths, but an adjustment is made to account for
uncertainty when data is scarce. This reduces the risk of highlighting
spurious associations.

This report presents refined methods for *IC* analysis, one method
for Bayesian dependency derivation. The disproportionality measure in
*IC* analysis is the Information Component (*IC*)
[EJCP,54(4):315-321,1998].
It relates the observed joint frequency of two particular states of two
different variables to the frequency expected under the assumption of
independence.
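
As a rough illustration of this ratio, the sketch below computes a plug-in value from raw counts. It assumes the common base-2 logarithmic form of the measure, which this abstract does not spell out. The function name, argument names and example counts are hypothetical, and the Bayesian estimate discussed in the report (which accounts for uncertainty when data is scarce) is not reproduced here.

```python
import math

def information_component(n_xy, n_x, n_y, n_total):
    """Plug-in IC: log2 of the observed joint count over the count
    expected if the two states were independent (assumed form)."""
    expected = n_x * n_y / n_total  # expected joint count under independence
    return math.log2(n_xy / expected)

# Hypothetical counts: 20 joint occurrences where independence predicts 2.
ic = information_component(n_xy=20, n_x=100, n_y=200, n_total=10_000)
```

A positive value means the pair co-occurs more often than independence predicts, and zero means it occurs exactly as often as expected. This plug-in form is undefined when the joint count is zero, one reason a raw ratio is unreliable for scarce data.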

In the current implementation of *IC* analysis, estimates for the
lower 95% credibility interval limit are derived based on a normal
approximation to the posterior *IC* distribution
[CSDA,34(4):473-493,2000].
In this report, the validity of this approximation is examined
through Monte Carlo simulation. Monte Carlo simulation is also
proposed and used as a general tool to study the *IC*
distribution.

For accurate lower credibility interval limit derivation over the
entire domain of possible parameter values, two Monte Carlo based
approaches are proposed: brute force simulation and a tabular method.
These methods differ in execution time and in the parameter ranges over
which they give accurate results. The optimal combination and
implementation of the known approaches depends strongly on the
characteristics of the database of interest.
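
The brute force idea can be sketched as follows. This is a minimal illustration, not the report's implementation. It assumes a multinomial model over the four cells of a 2x2 table with a Dirichlet posterior, takes the lower limit to be the 2.5% posterior quantile, and uses a hypothetical prior weight of 0.5 per cell. All names are made up for the example.

```python
import math
import random

def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def ic_lower_limit(n11, n10, n01, n00, prior=0.5, draws=20_000, seed=0):
    """Monte Carlo estimate of the 2.5% posterior quantile of IC.

    n11 counts records with both states present, n10 and n01 those
    with only one of them, and n00 those with neither.
    """
    rng = random.Random(seed)
    alphas = [n11 + prior, n10 + prior, n01 + prior, n00 + prior]
    samples = []
    for _ in range(draws):
        p11, p10, p01, _p00 = sample_dirichlet(alphas, rng)
        # Joint probability over the product of the two marginals.
        samples.append(math.log2(p11 / ((p11 + p10) * (p11 + p01))))
    samples.sort()
    return samples[int(0.025 * draws)]

lower_small = ic_lower_limit(20, 80, 180, 9_720)
lower_large = ic_lower_limit(200, 800, 1_800, 97_200)
```

Both example tables have the same observed-to-expected ratio, but the larger one yields a lower limit closer to the point value. The cost grows with the number of draws for every combination examined, which is the kind of execution-time trade-off that would motivate a precomputed table.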

Furthermore, this report shows that for a certain choice of
non-informative priors, the multinomial and the Poisson data models
yield equivalent posterior *IC* distributions, and that Monte Carlo
simulation under these circumstances is equivalent to the Bayesian
bootstrap.
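
One way to see why such an equivalence is plausible, without claiming to follow the report's own derivation, is to compare two simulations numerically. The sketch below assumes the standard Bayesian bootstrap, in which each record receives a normalised Exp(1) weight, and compares it with cell-level sampling from a Dirichlet distribution whose parameters are the raw cell counts (the posterior under an improper prior that adds no pseudo-counts). The record layout, counts and function names are hypothetical.

```python
import math
import random

def ic_from_cells(p11, p10, p01):
    """IC from cell probabilities of a 2x2 table (assumed log2 form)."""
    return math.log2(p11 / ((p11 + p10) * (p11 + p01)))

def bayesian_bootstrap_ic(records, draws=2_000, seed=1):
    """Record-level Bayesian bootstrap: weight every record, then sum per cell."""
    rng = random.Random(seed)
    out = []
    for _ in range(draws):
        cells = {(1, 1): 0.0, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.0}
        for rec in records:
            cells[rec] += rng.expovariate(1.0)
        total = sum(cells.values())
        out.append(ic_from_cells(cells[(1, 1)] / total,
                                 cells[(1, 0)] / total,
                                 cells[(0, 1)] / total))
    return out

def dirichlet_ic(counts, draws=2_000, seed=2):
    """Cell-level sampling from Dirichlet(counts) via normalised gamma draws."""
    rng = random.Random(seed)
    out = []
    for _ in range(draws):
        g = [rng.gammavariate(c, 1.0) for c in counts]
        total = sum(g)
        out.append(ic_from_cells(g[0] / total, g[1] / total, g[2] / total))
    return out

# Hypothetical table: 10 records with both states, 30 and 40 with one, 120 with neither.
records = [(1, 1)] * 10 + [(1, 0)] * 30 + [(0, 1)] * 40 + [(0, 0)] * 120
mean_bootstrap = sum(bayesian_bootstrap_ic(records)) / 2_000
mean_dirichlet = sum(dirichlet_ic([10, 30, 40, 120])) / 2_000
```

The two means should agree up to Monte Carlo error, because a sum of independent Exp(1) weights within a cell is gamma distributed with shape equal to the cell count. The same independent gamma draws arise as posteriors for cell rates under a Poisson model with a suitable improper prior, which hints at the link between the two data models.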

Relevant aspects of the multiple comparisons issue and problems
related to stratification and confounding variables are also
discussed.

Last modified: Sun Feb 16 23:26:30 CET 2003