INDEX
Explanations
references to historical figures and events
Negative Logits
iland      -0.15
á»ģn       -0.15
abyrin     -0.15
IFn        -0.14
mbH        -0.14
raphics    -0.14
åľ         -0.14
Ñģказ     -0.13
locks      -0.13
Orth       -0.13
Positive Logits
Roman      0.40
Roman      0.35
Rome       0.35
Romans     0.31
senator    0.30
Pompe      0.30
roman      0.29
Caesar     0.28
Roma       0.28
Senator    0.27
Activations Density: 0.113%