INDEX
Explanations
occurrences of references to people and their respective identifiers
Negative Logits
åıį        -0.18
WER        -0.16
hardt      -0.16
NewProp    -0.15
ubby       -0.14
heed       -0.14
Xxx        -0.14
ioc        -0.14
.twig      -0.14
ullet      -0.14
Positive Logits
warm       0.15
Mes        0.14
beg        0.14
Beg        0.14
div        0.13
fle        0.13
directive  0.13
insp       0.13
ble        0.13
ich        0.13
Activations Density 0.002%