Explanations
occurrences of the name "John."
Negative Logits
itious     -0.18
onet       -0.17
การ        -0.17
onse       -0.15
esus       -0.15
}elseif    -0.15
lect       -0.15
gaard      -0.15
orget      -0.15
urity      -0.14
Positive Logits
athan      0.48
nie        0.40
athon      0.38
atan       0.31
stone      0.30
ston       0.28
ny         0.24
nym        0.24
annes      0.24
ni         0.22
Activations Density 0.032%