Explanations
terms related to ancient Rome
references to the Roman context or culture
Negative Logits
Token         Logit
intosh        -1.10
mble          -0.94
ramid         -0.87
*/(           -0.86
olulu         -0.84
jri           -0.83
anwhile       -0.82
NetMessage    -0.82
lessly        -0.82
ickr          -0.81
Positive Logits
Token         Logit
Catholic      1.03
Reign         0.87
numer         0.87
Roman         0.86
Catholicism   0.86
Torch         0.85
Catholics     0.81
Emperor       0.78
Inquisition   0.78
Pont          0.77
Activations Density: 0.014%