Joint Entropy and Conditional Entropy
The joint entropy of a pair of discrete random variables X, Y ~ p(x,y) is the amount of information needed on average to specify both their values.
H(X,Y) = -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log_2 p(x,y)
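As a concrete check, here is a minimal Python sketch that evaluates this sum for a small joint distribution. The 2x2 table `p_xy` is invented purely for illustration; rows index values of X and columns index values of Y.

```python
import numpy as np

# A hypothetical joint distribution p(x, y) over two binary variables,
# chosen only for illustration; rows index x, columns index y.
p_xy = np.array([[0.25, 0.25],
                 [0.40, 0.10]])

# H(X,Y) = -sum_{x,y} p(x,y) * log2 p(x,y), skipping zero-probability cells
nonzero = p_xy[p_xy > 0]
H_XY = -np.sum(nonzero * np.log2(nonzero))
print(f"H(X,Y) = {H_XY:.4f} bits")  # ~1.8610 bits for this table
```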
The conditional entropy of a discrete random variable Y given another random variable X, for X, Y ~ p(x,y), expresses how much extra information you still need to supply on average to communicate Y, given that the other party already knows X.
H(Y|X) = -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log_2 p(y|x)
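Continuing with the same invented distribution, the conditional entropy can be computed by first forming p(y|x) = p(x,y)/p(x) from the joint table:

```python
import numpy as np

# Same hypothetical joint distribution as above (rows: x, columns: y).
p_xy = np.array([[0.25, 0.25],
                 [0.40, 0.10]])

p_x = p_xy.sum(axis=1, keepdims=True)   # marginal p(x), shape (2, 1)
p_y_given_x = p_xy / p_x                # conditional p(y|x), row by row

# H(Y|X) = -sum_{x,y} p(x,y) * log2 p(y|x), skipping zero-probability cells
mask = p_xy > 0
H_Y_given_X = -np.sum(p_xy[mask] * np.log2(p_y_given_x[mask]))
print(f"H(Y|X) = {H_Y_given_X:.4f} bits")  # ~0.8610 bits for this table
```

Note that the outer expectation is still taken with respect to the joint p(x,y): each log term is weighted by how often that (x, y) pair actually occurs.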
Chain Rule for Entropy: H(X,Y) = H(X) + H(Y|X). This follows by taking the expectation of -\log_2 p(x,y) = -\log_2 p(x) - \log_2 p(y|x) over the joint distribution.
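A short numeric check of the chain rule on the same invented 2x2 distribution; the `entropy` helper here is just a local convenience function defined for this sketch:

```python
import numpy as np

p_xy = np.array([[0.25, 0.25],
                 [0.40, 0.10]])

def entropy(p):
    """Shannon entropy in bits of a probability vector, ignoring zeros."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_X = entropy(p_xy.sum(axis=1))    # marginal entropy H(X)
H_XY = entropy(p_xy.ravel())       # joint entropy H(X,Y)

p_x = p_xy.sum(axis=1, keepdims=True)
mask = p_xy > 0
H_Y_given_X = -np.sum(p_xy[mask] * np.log2((p_xy / p_x)[mask]))

# The chain rule: both sides agree up to floating-point error.
assert np.isclose(H_X + H_Y_given_X, H_XY)
print(f"H(X) + H(Y|X) = {H_X + H_Y_given_X:.4f} = H(X,Y) = {H_XY:.4f}")
```

For this table H(X) = 1 bit and H(Y|X) ≈ 0.8610 bits, which sum to the joint entropy H(X,Y) ≈ 1.8610 bits computed earlier.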