INDEX
NEGATIVE LOGITS
ect      -0.16
bah      -0.15
arez     -0.15
Higher   -0.15
eker     -0.14
emma     -0.14
URRED    -0.14
ther     -0.14
higher   -0.14
ocker    -0.14
POSITIVE LOGITS
old      0.71
-old     0.59
old      0.52
olds     0.52
.old     0.46
OLD      0.42
olds     0.40
(old     0.40
_old     0.40
Old      0.40
Activations Density 0.032%