Explanations
References to mental health conditions, particularly ADHD and autism
Negative Logits
kp            -0.07
lander        -0.06
.echo         -0.06
Wand          -0.06
ozo           -0.06
ntl           -0.06
ục            -0.06
suspended     -0.06
Susp          -0.06
porn          -0.06
Positive Logits
Treat          0.10
treat          0.08
chemical       0.08
synd           0.07
dét            0.07
caused         0.07
treatments     0.07
science        0.07
disease        0.07
medical        0.06
Activation Density: 0.009%