statistical approach of decision when behavior is in learning process #160
hyunjimoon started this conversation in people; relating
Replies: 0 comments
//////////////////////////////////////////////////////////////////////
// Decisions, Decisions, Decisions
//////////////////////////////////////////////////////////////////////
// Consider a space of actions A. How do we construct a strategy that
// identifies an optimal decision?
// We need to quantify optimality with a utility function,
U : A -> R.
// We can then just take the action with the highest utility
a^dagger = argmax_{a \in A} U(a).
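With a fully known utility function this argmax is a direct computation. A minimal sketch in Python, where the action space and utility values are hypothetical illustrations:

```python
# Hypothetical finite action space A and a known utility U : A -> R.
actions = ["hold", "buy", "sell"]
utility = {"hold": 0.0, "buy": 1.2, "sell": -0.5}

# a^dagger = argmax_{a in A} U(a)
a_dagger = max(actions, key=lambda a: utility[a])
print(a_dagger)  # buy
```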
// In most cases utility functions will depend on behaviors we don't
// know. If those behaviors are modeled by parameters theta \in Theta
// then we have a model-based utility function
U : A times Theta -> R.
// In this case we can no longer unambiguously rank the actions by their
// utilities because changing theta can change the rankings.
// To make a decision we need to infer the unknown behaviors from some
// data tilde{y}. We could use a point estimate
a^dagger(tilde{y}) = argmax_{a \in A} U(a, hat{theta}(tilde{y}))
// or even Bayesian inference
a^dagger(tilde{y}) = argmax_{a \in A} \int dtheta pi(theta | tilde{y}) U(a, theta).
// Processes like these define data-dependent decision-making strategies
a^dagger: Y -> A.
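Both strategies can be sketched on a toy problem. Everything here is a hypothetical illustration: theta is a success probability, the action "go" pays theta - 0.5 while "stop" pays 0, the data are Bernoulli, and a Beta(1, 1) prior gives a conjugate posterior for the Bayesian strategy:

```python
import random

random.seed(1)

# Hypothetical model-based utility U(a, theta): "go" pays theta - 0.5,
# "stop" pays 0, so the ranking of actions depends on theta.
def utility(a, theta):
    return theta - 0.5 if a == "go" else 0.0

actions = ["go", "stop"]
y = [1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical observed Bernoulli data

# Point-estimate strategy: plug in hat{theta}(y), here the sample mean.
theta_hat = sum(y) / len(y)
a_point = max(actions, key=lambda a: utility(a, theta_hat))

# Bayesian strategy: average the utility over posterior draws.
# With a Beta(1, 1) prior the posterior is Beta(1 + sum(y), 1 + n - sum(y)).
draws = [random.betavariate(1 + sum(y), 1 + len(y) - sum(y))
         for _ in range(4000)]

def expected_utility(a):
    return sum(utility(a, t) for t in draws) / len(draws)

a_bayes = max(actions, key=expected_utility)
```

Here both strategies define a map from data y to an action, which is exactly the data-dependent strategy a^dagger: Y -> A above.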
// How well can we expect a decision-making strategy to work before we
// collect any data and hence before we can learn about theta?
// We don't know theta and we don't know what we will learn about theta
// because we don't know y. We need to consider what y we might see,
// what we would learn from that y, and then how good the resulting
// decision would be. In other words we need to simulate.
// In a Bayesian analysis this is straightforward:
tilde{theta} ~ pi(theta)
tilde{y} ~ pi(y | tilde{theta})
a^dagger(tilde{y})
U(a^dagger(tilde{y}), tilde{theta})
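The four-step simulation above can be run as a Monte Carlo loop. This sketch reuses the hypothetical go/stop utility and a point-estimate strategy, with a Uniform(0, 1) prior on theta and Bernoulli observations standing in for pi(theta) and pi(y | theta):

```python
import random

random.seed(0)

# Hypothetical utility: "go" pays theta - 0.5, "stop" pays 0.
def utility(a, theta):
    return theta - 0.5 if a == "go" else 0.0

def strategy(y):
    # Point-estimate strategy: act on the sample mean of the Bernoulli data.
    theta_hat = sum(y) / len(y)
    return max(["go", "stop"], key=lambda a: utility(a, theta_hat))

n, sims = 20, 2000
utilities = []
for _ in range(sims):
    theta_sim = random.random()                                    # tilde{theta} ~ pi(theta)
    y_sim = [int(random.random() < theta_sim) for _ in range(n)]   # tilde{y} ~ pi(y | tilde{theta})
    a = strategy(y_sim)                                            # a^dagger(tilde{y})
    utilities.append(utility(a, theta_sim))                        # U(a^dagger(tilde{y}), tilde{theta})

expected = sum(utilities) / sims
```

The average realized utility summarizes how well the strategy can be expected to work before any real data are collected.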
// Note that this "calibration" procedure, also known as "fucking around
// and finding out", depends on
// 1. The choice of Bayesian model pi(y, theta) = pi(y | theta) pi(theta).
// 2. The choice of decision-making strategy a^dagger: Y -> A.
// 3. The choice of model-based utility function U : A times Theta -> R.
// If we know what kind of overall performance we want then we can tune
// any of these choices. Changing (1) is known as experimental design.
// Changing (2) is tuning the strategy. Changing (3) is setting realistic
// expectations.