Mean and median of a symmetric distribution and a right-skewed distribution
Disease X: Mean and median are the same.
Multiple myeloma: The mean is pulled toward the skew.
Impact of skewed data
This is perhaps easiest to see by comparing the two distributions we just looked at, which show time to death after diagnosis.
For both disease X and MM you have on average 3 years to live.  Does that mean you don’t care which one you get?
Well, of the 25 people getting disease X, only 1 died in the first year after diagnosis.  Of the ones getting MM, 7 did.
So if you get X, according to what we see here only 1/25, or 4 percent, of people don’t make it through year one.
But if you get MM, well, with 7 dying in year one, it means you have an almost 30% chance of not making it even a year.
Now, you might be one of these very few who live a long time, but it is much more likely that it is time to get your will together
and hurry around to say goodbye to your loved ones.
Means are the same, medians are different, because of the shape of the distribution.
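If you want to check this kind of comparison yourself, here is a minimal sketch in Python. The survival times below are made up for illustration and are not the data from the slides. They are only built to match what we just said: 25 patients each (the sample size of 25 for MM is an assumption), a mean of 3 years for both, 1 first-year death for disease X, and 7 for MM. The function name summarize is also just a hypothetical helper.

```python
from statistics import mean, median

# Hypothetical survival times in years (NOT the lecture's data).
# Disease X: symmetric around 3 years, 25 patients.
disease_x = (
    [0.5] + [1.5] * 2 + [2.0] * 2 + [2.5] * 4 + [3.0] * 7
    + [3.5] * 4 + [4.0] * 2 + [4.5] * 2 + [5.5]
)
# MM: right-skewed, 25 patients (assumed), with a long tail of survivors.
mm = (
    [0.5] * 7 + [1.0] * 3 + [1.5] * 3 + [2.0] * 2 + [3.0] * 2
    + [4.0] * 2 + [5.0] * 2 + [6.0, 8.0, 10.0, 12.0]
)

def summarize(times):
    """Return sample size, mean, median, and fraction dying before 1 year."""
    died_year_one = sum(t < 1.0 for t in times)
    return {
        "n": len(times),
        "mean": mean(times),
        "median": median(times),
        "first_year_fraction": died_year_one / len(times),
    }

print("Disease X:", summarize(disease_x))
print("MM:       ", summarize(mm))
```

With these made-up numbers both means come out to 3.0 years, but the median is 3.0 for disease X and only 1.5 for MM, and the first-year fractions are 0.04 and 0.28. Same average, very different prospects.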
This is one of the major take-home messages from this class: you all thought you knew what an average meant, and you did,
but you should also realize that what the average is telling you is different depending on the distribution.
When the doctor diagnoses you with some disease, and people with that disease live on average for 3 years,
you say: Doctor!  Show me the distribution!
And as you go on in biology and you see charts like this in journal articles or even in the paper, you now know why they are showing them to you.
Statistical descriptors, like using the mean to describe the center, are only telling you so much.
To really understand what is going on you have to plot the data and look at the distribution for things like overall shape, symmetry, and the presence of outliers, and you have to understand the effect they have on things like the mean.
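Looking at the distribution does not need fancy software. As a rough sketch, the hypothetical helper below (text_histogram is not from any library, and the sample is made up) counts non-negative times into one-year bins and draws one row of # marks per bin, which is enough to see shape, symmetry, and stray values far from the rest.

```python
from collections import Counter

def text_histogram(times):
    """Return one text row per one-year bin, from year 0 up to the largest value.

    Assumes all times are non-negative; a time t falls in the bin [int(t), int(t) + 1).
    """
    counts = Counter(int(t) for t in times)
    rows = []
    for year in range(max(counts) + 1):
        # A Counter returns 0 for bins that have no observations.
        rows.append(f"{year:>2}-{year + 1:<2} | {'#' * counts[year]}")
    return rows

# Made-up sample: most values are small, one sits far out to the right.
sample = [0.5, 0.7, 0.9, 1.2, 1.8, 2.5, 3.1, 9.4]
for row in text_histogram(sample):
    print(row)
```

A pile of marks at one end with a long, thin trail toward the other is the picture of a skewed distribution, and that is exactly the case where the mean alone will mislead you.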

Now, the next obvious question for a biologist of course is why you see these different types of patterns.
The top is a normal distribution, which represents lots of things in the natural world, as we have seen in our women’s height and toucan bill examples.
The distribution on the bottom is very different, and when you see something like this it challenges researchers to understand it. Why do such a large percentage of people die so quickly? Is there one single thing that, if we could figure it out, would save a huge chunk of the people dying down here? And for the few who live a long time, could researchers figure out what it is about either these people or their treatment that allowed them to live so long? Lots is still not known, but a big part of it is that this diagnosis, MM, does not have the word “multiple” in its name for no reason. When you get down to the level of the cells involved, there are lots of different ones, so it is really a suite of diseases. So this diagnosis is like “cancer” in general: a term that covers a broad range of biological phenomena that you can study, pick apart, and understand from the cell biology level to the epidemiological level using not your intuition, but statistics.
Now let’s move on from describing the center to describing the spread and symmetry, which are, again, really different for these two distributions.