Explanations
instances of proper nouns or names related to people or organizations
Negative Logits
Shakspeare    -0.93
bbene         -0.89
ſhe           -0.86
%"),          -0.85
に            -0.85
Grüsse        -0.85
Houſe         -0.84
faſt          -0.84
alſo          -0.83
strix         -0.83
Positive Logits
Me            1.11
Me            1.05
Se            0.96
Re            0.96
Se            0.94
Be            0.93
La            0.92
Re            0.92
Te            0.92
Ge            0.91
Activations Density 0.181%
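
The page does not define the density figure. A minimal sketch of one common reading, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name, the threshold parameter, and the sample data are hypothetical and chosen only to reproduce the 0.181% figure:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold.

    Assumption: "active" means strictly greater than `threshold`.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 181 active positions out of 100,000 tokens.
sample = [1.0] * 181 + [0.0] * 99_819
print(f"{activation_density(sample):.3%}")  # prints 0.181%
```

Under this reading, a density of 0.181% means the feature fires on roughly 2 of every 1,000 tokens, so it is a fairly sparse feature.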