The taxonomy consists of 24 risks specific to the use of text-based AI models by children.
Each risk has the following properties:
| Parent category | The general category of risk this belongs to. Example: Physical, Health & Legal Safety |
| --- | --- |
| Name | The name of this specific risk. Example: Self-Harm & Eating Disorders |
| Description | A general description of the harmful behavior associated with this risk. Example: Content that promotes, normalizes, or inadequately responds to suicide, self-injury, eating disorders, or harmful body-related behaviors. |
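
As a rough illustration, the three properties could be held in a small record type. This is only a sketch: the class name `Risk` and its field names are assumptions made for this example, not the benchmark's actual schema. The values are the examples given in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One entry of the risk taxonomy (hypothetical representation)."""

    parent_category: str  # general category the risk belongs to
    name: str             # name of the specific risk
    description: str      # harmful behavior associated with the risk


# Example entry built from the values shown in the table.
example_risk = Risk(
    parent_category="Physical, Health & Legal Safety",
    name="Self-Harm & Eating Disorders",
    description=(
        "Content that promotes, normalizes, or inadequately responds to "
        "suicide, self-injury, eating disorders, or harmful body-related "
        "behaviors."
    ),
)
```
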
The full risk taxonomy can be found here.
The benchmark will evaluate each of these risks across the following age ranges:
| 7 to 9 | Children primarily exhibit concrete thinking and high trust in authority, making them especially vulnerable to misunderstanding consequences and over-relying on AI guidance. |
|---|---|
| 10 to 12 | Children begin developing abstract reasoning and social awareness, resulting in more ambiguous risk signals shaped by peer influence and inconsistent judgment. |
| 13 to 17 | Adolescents have greater autonomy and expressive ability, with risks often emerging explicitly but intertwined with identity exploration, emotional intensity, and social pressure. |
For each risk in the taxonomy and each age range, scenario seeds are generated across a range of motivational profiles. Varying the profile ensures that the hundreds of seeds generated for every combination of risk and age range are well distributed.
GPT-4o is used for this step because it reliably produces creative, varied, and natural scenarios.
The scenario seeds are then expanded one by one into fully fleshed-out scenarios.
GPT-5.2 is used for this step to expand the seeds consistently and faithfully, reducing drift while preserving the original intent and structure of each scenario.
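
The two-stage flow described above (generate seeds per risk, age range, and motivational profile, then expand each seed individually) can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark's implementation: the names `SeedRequest`, `build_requests`, and `run_pipeline` are hypothetical, the motivational profiles `"curiosity"` and `"distress"` are placeholders because the actual profiles are not listed here, and the two model calls are replaced by plain stub functions so that no particular API is implied.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Sequence

# Age ranges as listed in the table above.
AGE_RANGES = ["7 to 9", "10 to 12", "13 to 17"]


@dataclass(frozen=True)
class SeedRequest:
    """One (risk, age range, motivational profile) combination."""

    risk_name: str
    age_range: str
    profile: str


def build_requests(
    risk_names: Sequence[str],
    profiles: Sequence[str],
    age_ranges: Sequence[str] = AGE_RANGES,
) -> List[SeedRequest]:
    """Enumerate every risk x age range x profile combination."""
    return [
        SeedRequest(risk, age, profile)
        for risk, age, profile in product(risk_names, age_ranges, profiles)
    ]


def run_pipeline(
    requests: Sequence[SeedRequest],
    generate_seeds: Callable[[SeedRequest], List[str]],
    expand_seed: Callable[[str], str],
) -> List[str]:
    """Stage 1: generate seeds per request. Stage 2: expand each seed alone."""
    scenarios = []
    for request in requests:
        for seed in generate_seeds(request):
            scenarios.append(expand_seed(seed))
    return scenarios


# Stubs standing in for the seed-generation and expansion model calls.
def stub_generate(request: SeedRequest) -> List[str]:
    return [f"seed: {request.risk_name} / {request.age_range} / {request.profile}"]


def stub_expand(seed: str) -> str:
    return seed.replace("seed", "scenario", 1)


# Placeholder profiles; the real motivational profiles are an assumption here.
requests = build_requests(
    risk_names=["Self-Harm & Eating Disorders"],
    profiles=["curiosity", "distress"],
)
scenarios = run_pipeline(requests, stub_generate, stub_expand)
```

Passing the generator and expander in as callables keeps the enumeration logic independent of whichever model is used for each stage.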