Explanations
the word "Anyone" in sentences
references to individuals with the word "anyone."
Negative Logits
Token        Logit
ories        -0.74
bows         -0.74
urations     -0.70
iger         -0.67
itals        -0.66
ffic         -0.66
irth         -0.66
ulk          -0.65
ÃŁ           -0.63
ritz         -0.62
Positive Logits
Token        Logit
else         1.87
Else         1.29
Else         1.27
else         1.26
THING        0.95
who          0.93
imaginable   0.90
soever       0.89
doubted      0.89
20439        0.89
Activations Density 0.038%
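
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of sampled token positions on which the feature's activation is above zero. This is an assumption about the metric, not a statement of how this dashboard calculates it. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard's exact definition is not given.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 4 of which activate the feature.
acts = [0.0] * 10_000
for i in (12, 407, 5_003, 9_998):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # formatted as a percentage, like the dashboard figure
```

Under this reading, a density of 0.038% would mean the feature fires on roughly 38 of every 100,000 sampled tokens.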