INDEX
Explanations
Formal titles referring to individuals, particularly the honorific "Sir."
Negative Logits
itſelf        -1.02
Efq           -0.92
houſe         -0.90
pleaſure      -0.88
Искәрмәләр    -0.87
purpoſe       -0.86
whoſe         -0.82
myſelf        -0.81
ſche          -0.81
Mako          -0.81
Positive Logits
Sir           1.39
SIR           1.32
SIR           1.22
Sir           1.17
Sirs          0.98
Sira          0.94
sir           0.86
CORBA         0.72
Siro          0.68
Laird         0.67
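The listings above rank vocabulary tokens by how strongly the feature pushes their logits down or up. The page does not say how the values were computed. A common approach, assumed here, is to dot the feature's output direction with each token's unembedding vector and sort the results. The function names, the two-dimensional vectors, and the tiny vocabulary below are all hypothetical toy values, not data from this feature.

```python
from typing import Dict, List, Tuple


def logit_effects(direction: List[float],
                  unembed: Dict[str, List[float]]) -> Dict[str, float]:
    """Dot the (assumed) feature output direction with each token's unembedding vector."""
    return {
        token: sum(d * w for d, w in zip(direction, vector))
        for token, vector in unembed.items()
    }


def top_and_bottom(effects: Dict[str, float],
                   k: int) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return the k most positive and the k most negative tokens."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Toy inputs: hypothetical numbers chosen only to illustrate the ranking.
direction = [1.0, -0.5]
unembed = {
    "Sir": [1.2, -0.4],
    "sir": [0.8, -0.1],
    "house": [-0.7, 0.4],
    "the": [0.1, 0.2],
}

effects = logit_effects(direction, unembed)
positive, negative = top_and_bottom(effects, k=2)
```

With these toy inputs, "Sir" and "sir" land in the positive list and "house" leads the negative list, mirroring the shape of the listings above without reproducing their values.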
Activations Density 0.004%
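The density figure is usually read as the fraction of tokens on which the feature fires at all. The page does not define it, so the sketch below assumes that reading and a firing threshold of zero. The function name `activation_density` and the activation values are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds the threshold (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy sample: the feature fires on 4 of 100,000 tokens.
acts = [0.0] * 99_996 + [2.3, 0.7, 1.1, 5.0]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.004%
```

Under this reading, a density of 0.004% means the feature is active on roughly 4 tokens in every 100,000.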