Negative Logits
Token       Logit
itſelf      -1.27
Efq         -1.16
Houſe       -1.10
Jefus       -1.05
Majefty     -1.03
―――――       -1.02
Theſe       -1.02
pleaſure    -1.00
Diſ         -0.98
Eſ          -0.97
Positive Logits
Token       Logit
↵↵          0.76
all         0.71
↵           0.70
“           0.69
’           0.67
'           0.66
“           0.65
"           0.63
The         0.62
            0.60
Activations Density 0.086%