INDEX
Explanations
instances of the word "autistic" or variations of it
terms related to autism
Negative Logits
mary      -0.78
liam      -0.70
Soda      -0.69
Else      -0.68
ORK       -0.66
FIRE      -0.65
Beir      -0.64
BE        -0.63
ç«        -0.63
Water     -0.60
Positive Logits
umn       1.05
ilus      1.02
emort     0.99
opsy      0.95
onomous   0.95
aut       0.91
obil      0.91
ica       0.86
iliary    0.86
olog      0.86
Activations Density: 0.005%