*New England Journal of Medicine* (*NEJM*), showed that the older children within each grade are about 30% less likely to be labeled as having attention deficit–hyperactivity disorder (ADHD).

Most U.S. school systems group children together in one-year cohorts based on a cutoff date, usually August 31 / September 1. For those school systems, the *NEJM* article looked at rates of ADHD diagnosis for all of the children, grouped by month of birth. The analysis primarily compared ADHD rates for adjacent months, as here:

The graphic above shows that the rate difference between August-born children and September-born children is statistically significant (*p* < .05; note the 95% error bar clearing the dotted “zero” line), but that no other adjacent months show a statistically significant difference.
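
As a rough illustration of what "the 95% error bar clearing the zero line" means, here is a minimal Python sketch of a rate difference between two adjacent birth months with a normal-approximation (Wald) interval. The function name `rate_difference_ci` and the counts are hypothetical placeholders, not the *NEJM* data, and the article's own error bars may have been computed differently.

```python
import math

def rate_difference_ci(cases_a, n_a, cases_b, n_b, z=1.96):
    """Difference in rates (a minus b) with a Wald interval (z=1.96 gives about 95%)."""
    pa, pb = cases_a / n_a, cases_b / n_b
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(pa * (1 - pa) / n_a + pb * (1 - pb) / n_b)
    diff = pa - pb
    return diff, diff - z * se, diff + z * se

# Hypothetical counts for illustration only (NOT the NEJM figures).
diff, lo, hi = rate_difference_ci(90, 10_000, 60, 10_000)
clears_zero = lo > 0 or hi < 0  # interval excludes zero -> significant at about .05
print(f"difference = {diff:.4f}, 95% CI [{lo:.4f}, {hi:.4f}], clears zero: {clears_zero}")
```

If the interval includes zero, the two months cannot be distinguished at that level, which is the situation the graphic shows for every adjacent pair other than August and September.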

I believed that one could show stronger evidence from a more holistic look at the data. Using the table of data from above, I made a graph using R. In the graph below, blue columns show the rate of ADHD diagnosis by birth month. The oldest students, at left, have birthdays in September. The graph also shows a red curved regression line, and orange 95% error bars for each month, based on a binomial distribution on each month's sample size.
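
The per-month error bars can be sketched as follows. This is a minimal Python version using the normal approximation to the binomial; the original graph was made in R and its exact interval method is not shown, so treat the formula choice, the name `binomial_ci`, and the counts as assumptions for illustration.

```python
import math

def binomial_ci(cases, n, z=1.96):
    """Observed rate with a normal-approximation (Wald) interval for a binomial proportion."""
    rate = cases / n
    se = math.sqrt(rate * (1 - rate) / n)  # binomial standard error of the rate
    return rate, rate - z * se, rate + z * se

# Hypothetical (cases, children) per birth month; NOT the NEJM table.
months = {"Sep": (60, 10_000), "Aug": (90, 10_000)}
for month, (cases, n) in months.items():
    rate, lo, hi = binomial_ci(cases, n)
    print(f"{month}: rate {rate:.4f}, 95% CI [{lo:.4f}, {hi:.4f}]")
```

Because the standard error shrinks with the square root of the sample size, months with more children get tighter bars.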

To put this in narrative form, it is not so much that the youngest (August birthday) children have elevated ADHD rates, as that the older half of the class, on the left, has increasingly lower ADHD rates. It appears that about a third of the oldest have matured out of the level of behavior which would result in an ADHD diagnosis. Teachers and pediatricians might wish to take this into account, especially before concluding that a child in the younger half of the class has ADHD, at least in borderline cases.

The younger half of the class, at right, shows a less clear trend. This nonlinearity is shown by the curved regression line, which is upward sloping and downward curving. Of course, humans make note of patterns, and random effects may look like a pattern. To calculate whether these patterns are statistically significant, I ran a regression on both the linear and squared features. It showed strong significance, with *p* < .001 for the upward sloping linear feature, and *p* = .001 for the squared feature (the downward curve). Further analysis, considering that the actual statistical deviation of the measured samples is smaller than their apparent deviation compared to each other, brought *p* << .001.
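
A regression on linear and squared features can be sketched in plain Python as below. This is an unweighted ordinary least squares fit, which is an assumption (the original R code is not shown). The month coding (1 = September through 12 = August), the function names, and the rates are hypothetical. The code returns t-statistics rather than *p*-values. With 12 months and 3 coefficients there are 9 degrees of freedom, so |t| > 2.262 corresponds to a two-sided *p* < .05.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]

def quadratic_fit(x, y):
    """OLS fit of y = b0 + b1*x + b2*x^2; returns (coefficients, t-statistics)."""
    X = [[1.0, xi, xi * xi] for xi in x]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (len(x) - k)  # residual variance
    t = [b / math.sqrt(sigma2 * inv[i][i]) for i, b in enumerate(beta)]
    return beta, t

# Hypothetical rates per 10,000 children, September (1) to August (12); NOT the NEJM table.
months = list(range(1, 13))
rates = [64, 67, 70, 72, 75, 78, 80, 81, 83, 82, 84, 85]
beta, t = quadratic_fit(months, rates)
print("linear coefficient:", beta[1], "t =", t[1])
print("squared coefficient:", beta[2], "t =", t[2])
```

A positive linear coefficient with a negative squared coefficient matches the "upward sloping and downward curving" shape described above.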

Recent Twitter correspondence with coauthor Timothy Layton provided a plausible explanation for the flattening on the right side of the graph: Children born in the summer are more likely to be held back a year, and thus to become the oldest children in a new cohort – especially if they exhibit less mature behavior. This holding back may replace an ADHD diagnosis as a solution to behavioral issues, and/or may reduce later ADHD diagnoses as the child is now compared to a younger, less mature cohort.

Apart from what the data is about in this case, this analysis presented
some interesting exercises for understanding the use of data:

- that is categorized or grouped by range;
- where the sampling error of the measured samples is smaller than their apparent deviation when compared to each other; and
- where Monte Carlo simulations may prove helpful.
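
To make the Monte Carlo idea concrete, here is a minimal Python sketch. It assumes a null hypothesis in which every birth month shares the pooled ADHD rate, draws each month's simulated rate from a normal approximation to its binomial sampling error, and counts how often the simulated trend is at least as steep as the observed one. The function names, the choice of slope as the test statistic, and the counts are all hypothetical; this is not the analysis behind the *p*-values quoted above.

```python
import math
import random

def slope(y):
    """Least-squares slope of y against month index 0..len(y)-1."""
    n = len(y)
    xbar = (n - 1) / 2
    ybar = sum(y) / n
    sxx = sum((i - xbar) ** 2 for i in range(n))
    return sum((i - xbar) * (v - ybar) for i, v in enumerate(y)) / sxx

def monte_carlo_p(cases, sizes, sims=10_000, seed=0):
    """One-sided Monte Carlo p-value for the observed slope under a flat-rate null."""
    rng = random.Random(seed)
    pooled = sum(cases) / sum(sizes)
    observed = slope([c / n for c, n in zip(cases, sizes)])
    hits = 0
    for _ in range(sims):
        # Normal approximation to each month's binomial sampling error (an assumption).
        fake = [rng.gauss(pooled, math.sqrt(pooled * (1 - pooled) / n)) for n in sizes]
        if slope(fake) >= observed:
            hits += 1
    # Add-one estimator so the reported p-value is never exactly zero.
    return (hits + 1) / (sims + 1)

# Hypothetical counts per birth month, September to August; NOT the NEJM table.
cases = [64, 67, 70, 72, 75, 78, 80, 81, 83, 82, 84, 85]
sizes = [10_000] * 12
print("Monte Carlo p-value for the upward trend:", monte_carlo_p(cases, sizes))
```

The simulation uses each month's sampling error directly, rather than the scatter of the twelve points around a fitted line, which is the distinction the second bullet draws.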