INDEX
Explanations
titles of royalty, specifically the word "Emperor"
references to historical figures, specifically emperors
references to emperors and imperial figures
Negative Logits
ter        -0.79
chel       -0.77
trak       -0.73
asket      -0.73
esville    -0.72
ties       -0.71
tera       -0.71
nel        -0.70
crew       -0.69
dos        -0.67
Positive Logits
peror      1.07
Nero       1.04
perors     1.02
emperor    0.92
Emperor    0.92
pengu      0.90
esses      0.83
Napoleon   0.81
Claud      0.79
Augustus   0.75
Activations Density: 0.008%