Explanations
proper nouns, particularly names of individuals
Negative Logits
purpoſe     -1.13
itſelf      -1.11
pleaſure    -1.11
Theſe       -1.10
ſelf        -1.04
Monfieur    -1.02
poffible    -1.01
myſelf      -1.01
Jefus       -0.99
Cæsar       -0.99
Positive Logits
John        0.53
ra          0.52
Sa          0.51
sa          0.48
            0.48
´           0.47
senior      0.46
ju          0.46
A           0.45
r           0.45
Activations Density 0.616%