INDEX
Explanations
references to significant historical figures and events
Negative Logits
engin      -0.16
awah       -0.16
Anglic     -0.16
sburg      -0.15
arrass     -0.15
å¹¹ç·ļ     -0.15
ë¡         -0.15
çĤ®        -0.14
bish       -0.14
andal      -0.14
Positive Logits
Athens     0.31
Spartan    0.28
Athen      0.28
hop        0.25
Lesbian    0.25
Marathon   0.24
Greek      0.24
Greece     0.23
Lesb       0.23
hop        0.22
Activations Density 0.050%