Negative Logits
probe        -1.28
probes       -1.13
Probe        -1.10
Probes       -1.02
probe        -0.91
sclerosis    -0.84
Probe        -0.84
probed       -0.68
Autoritní    -0.65
probing      -0.61
Positive Logits
Púb          0.65
Spin         0.59
word         0.56
^{}          0.55
e            0.54
castor       0.53
heets        0.52
Seeder       0.52
mya          0.52
Mouse        0.52
Activations Density 0.991%