INDEX
Explanations
references to royalty or imperial figures
references to emperors
Negative Logits
Token     Logit
rental    -0.72
BIL       -0.68
LOC       -0.67
swing     -0.65
Cheryl    -0.64
vers      -0.64
canv      -0.63
cas       -0.63
pickup    -0.62
rented    -0.62
Positive Logits
Token     Logit
Emperor   3.76
emperor   2.95
Empress   2.35
peror     2.28
perors    1.57
Napoleon  1.41
Buddha    1.39
Empire    1.35
Pope      1.33
Vader     1.32
Activations Density 0.023%
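
For readers unfamiliar with the density figure, the following is a minimal sketch of one way such a number could be computed. It assumes that density means the fraction of token positions where the feature's activation is above zero; the page itself does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions on which the
    feature fires at all. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, of which 2 are active.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.023% would mean the feature fires on roughly 23 of every 100,000 tokens, which fits a feature tied to a narrow set of words such as "Emperor".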