INDEX
NEGATIVE LOGITS
Token         Logit
itſelf        -1.02
iſt           -0.95
Anſ           -0.94
Monfieur      -0.94
BibitemShut   -0.91
themſelves    -0.90
Houſe         -0.90
myſelf        -0.90
ſever         -0.89
ſelves        -0.89
POSITIVE LOGITS
Token         Logit
a             0.73
e             0.63
C             0.52
i             0.51
o             0.51
forma         0.49
ly            0.48
tf            0.47
id            0.47
R             0.47
Activations Density 1.667%