INDEX
Explanations
the name "Gordon" and its variations in the text
Negative Logits
etu       -0.17
nej       -0.16
olation   -0.16
Ku        -0.15
atoi      -0.15
itia      -0.15
illery    -0.14
anning    -0.14
sst       -0.14
æĮ¯       -0.14
Positive Logits
swer      0.20
wart      0.17
ziej      0.16
ign       0.16
vale      0.16
mixed     0.15
Others    0.15
iag       0.14
éné       0.14
enthal    0.14
Activations Density 0.005%