Explanations
proper nouns related to historical or literary figures
names of historical figures, particularly those from ancient Rome and related contexts
Negative Logits
Token       Logit
arten       -0.94
lag         -0.89
tw          -0.87
rien        -0.83
caster      -0.80
eals        -0.80
need        -0.79
WAY         -0.77
taboola     -0.77
ynthesis    -0.75
Positive Logits
Token       Logit
Caesar      1.45
Augustus    1.18
Claud       1.15
Britann     0.99
Nero        0.98
Romans      0.93
amph        0.90
Gaul        0.87
Titus       0.86
olini       0.83
Activation Density: 0.012%
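
The density figure above can be read as the share of tokens on which the feature fires at all. The sketch below shows that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard, which does not say how its figure was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens on which the
    feature is active. The dashboard does not document its definition.
    """
    values = list(activations)
    if not values:
        return 0.0
    fired = sum(1 for a in values if a > threshold)
    return fired / len(values)


# Hypothetical data: the feature fires on 3 of 25,000 tokens.
sample = [0.0] * 24997 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is silent on nearly every token, which fits a feature that responds only to a narrow set of names.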