INDEX
Explanations
references to empires and imperial authority
Negative Logits
JAV         -0.63
Wit         -0.60
ear         -0.59
Bot         -0.57
Daryl       -0.54
Thom        -0.54
Greenway    -0.53
jav         -0.53
gat         -0.53
elen        -0.53
Positive Logits
Empire      2.90
Empire      2.69
empire      2.54
EMPIRE      2.46
empire      2.22
empires     2.16
Empires     1.97
Imperial    1.89
imperial    1.86
Imperial    1.78
Activations Density: 0.065%