INDEX
Explanations
words related to education and sports
Negative Logits
itſelf     -1.32
Theſe      -1.27
Efq        -1.27
myſelf     -1.23
ſhe        -1.21
himſelf    -1.19
Monfieur   -1.18
ſeveral    -1.16
whoſe      -1.16
ſever      -1.15
Positive Logits
↵↵                  0.98
↵                   0.91
[token not shown]   0.68
(                   0.63
1                   0.63
'                   0.63
I                   0.60
2                   0.59
*                   0.58
I                   0.58
Activations Density 0.961%