Explanations
references to historical figures or entities associated with ancient Rome
references to Roman themes or contexts
Negative Logits
Token       Logit
laun        -0.76
FH          -0.73
Ellison     -0.72
kWh         -0.71
APD         -0.69
ufact       -0.69
DG          -0.66
haw         -0.66
APP         -0.66
heed        -0.64
Positive Logits
Token       Logit
Roman       3.71
Roman       2.87
Romans      2.25
Rome        1.72
Romanian    1.69
Greek       1.53
roman       1.48
Byzantine   1.46
Greek       1.42
Caesar      1.40
Activations Density 0.006%