Discrete Random Variables
Suppose that you have a stochastic process that takes discrete values (e.g., the outcomes of tossing a coin 10 times, or the number of customers who arrive at a store in 10 minutes). In such cases, we can calculate the probability of observing a particular set of outcomes by making suitable assumptions about the underlying stochastic process (e.g., that the probability of the coin landing heads is p and that the coin tosses are independent).
Denote the observed outcomes by O and the set of parameters that describe the stochastic process by θ. Thus, when we speak of probability, we want to calculate P(O|θ). In other words, given specific values for θ, P(O|θ) is the probability that we would observe the outcomes represented by O.
However, when we model a real-life stochastic process, we often do not know θ. We simply observe O, and the goal is then to arrive at an estimate of θ that is a plausible choice given the observed outcomes O. We know that, given a value of θ, the probability of observing O is P(O|θ). Thus, a 'natural' estimation process is to choose the value of θ that maximizes the probability that we would actually observe O. In other words, we find the parameter values θ that maximize the following function, viewed as a function of θ with O held fixed:
L(θ|O) = P(O|θ)
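As a minimal sketch of this estimation idea, take the coin example and assume (purely for illustration) that we observed 7 heads in 10 independent tosses. The function names, the observed count, and the grid search are hypothetical choices, not part of the original discussion; the point is only that we evaluate P(O|θ) for many candidate values of θ = p and keep the one that makes the observed outcome most probable.

```python
import math


def binomial_likelihood(p, heads, tosses):
    """P(O | p): probability of `heads` heads in `tosses` independent tosses."""
    return math.comb(tosses, heads) * p**heads * (1 - p) ** (tosses - heads)


def grid_mle(heads, tosses, steps=1000):
    """Return the p on an evenly spaced grid over [0, 1] that maximizes P(O | p)."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: binomial_likelihood(p, heads, tosses))


# Illustrative data (an assumption): 7 heads observed in 10 tosses.
p_hat = grid_mle(heads=7, tosses=10)
print(p_hat)
```

Under these assumptions the maximizing value coincides with the observed fraction of heads, which is what one would intuitively guess for p.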
Continuous Random Variables
In the continuous case the situation is similar, with one important difference. We can no longer talk about the probability that we observed O given θ, because in the continuous case P(O|θ) = 0: the probability of observing any exact value is zero. Without getting into technicalities, the basic idea is as follows:
Denote the probability density function (pdf) associated with the outcomes O by f(O|θ). Thus, in the continuous case we estimate θ given the observed outcomes O by maximizing the following function:
L(θ|O) = f(O|θ)
In this situation, we cannot technically assert that we are finding the parameter value that maximizes the probability that we observe O, because what we maximize is the pdf associated with the observed outcomes O, and a density value is not a probability.
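The continuous case can be sketched in the same way. The example below assumes, for illustration only, that the observations are independent draws from a normal distribution with a known standard deviation, so that θ is just the mean; the data values, the function names, and the grid search are all hypothetical. It also shows why the maximized quantity should not be read as a probability: the joint density at the estimate is greater than 1.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def likelihood(mu, observations, sigma):
    """f(O | mu): joint density of the observations, assuming independence."""
    value = 1.0
    for x in observations:
        value *= normal_pdf(x, mu, sigma)
    return value


# Illustrative data and known sigma (both assumptions).
observations = [0.9, 1.0, 1.1, 1.2]
sigma = 0.1

# Evaluate the density on a grid of candidate means in [0, 2].
grid = [i / 1000 for i in range(2001)]
mu_hat = max(grid, key=lambda mu: likelihood(mu, observations, sigma))

print(mu_hat)
print(likelihood(mu_hat, observations, sigma))  # a density value, not a probability
```

Here the estimate lands on the sample mean of the made-up data, and the printed density exceeds 1, which no probability could do.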