Explanations
references to imperialism and its related historical contexts
Negative Logits
Token     Logit
abouts    -0.17
istan     -0.15
テル      -0.15
wyn       -0.15
bjerg     -0.15
ランド    -0.15
erot      -0.15
365       -0.14
儿        -0.14
nett      -0.14
Positive Logits
Token         Logit
wide          0.20
-wide         0.19
iyet          0.15
871           0.14
/Common       0.14
际            0.14
Wide          0.14
/Foundation   0.14
ulton         0.13
ople          0.13
Activations Density: 0.028%