INDEX
Explanations
references to individuals named Thomas or Tom
Negative Logits
Alman    -0.16
asca     -0.14
ipay     -0.13
ubby     -0.13
obao     -0.13
ipse     -0.13
ynes     -0.13
voie     -0.13
imes     -0.13
nds      -0.13
Positive Logits
islav    0.17
Reuters  0.16
egal     0.15
othy     0.15
zilla    0.14
нав      0.14
uct      0.14
ors      0.14
maz      0.14
ppe      0.14
Activations Density: 0.019%