INDEX
Explanations
references to noble titles and historical figures
Negative Logits
Token        Logit
439          -0.17
iber         -0.15
ands         -0.15
enberg       -0.15
opposite     -0.14
âĢĮ          -0.14
anda         -0.14
Suspension   -0.14
eced         -0.14
.same        -0.14
Positive Logits
Token        Logit
Lord         0.27
Prince       0.21
Lord         0.21
lord         0.19
Governor     0.18
Master       0.18
LORD         0.18
King         0.18
Baron        0.17
Bishop       0.16
Activations Density: 0.075%