INDEX
Explanations
References to individuals named "El" or close variants, likely focusing on names and titles
Negative Logits
unci      -0.17
ously     -0.16
rome      -0.15
oples     -0.15
antino    -0.15
egr       -0.15
Aspect    -0.15
ilion     -0.14
emi       -0.14
yn        -0.14
Positive Logits
arith     0.17
abeth     0.15
tiler     0.15
uding     0.15
ATEST     0.14
asti      0.14
/>\       0.14
odie      0.14
usive     0.14
rit       0.14
Activations Density: 0.065%