INDEX
Explanations
references to specific historical years and events related to political history
Negative Logits
alfa     -0.18
ptron    -0.17
ellers   -0.17
leta     -0.16
anus     -0.16
曜       -0.16
ッグ     -0.16
yth      -0.15
ioned    -0.15
ippet    -0.15
Positive Logits
achts    0.18
lies     0.18
maal     0.16
aten     0.16
ers      0.14
rtl      0.14
ioni     0.14
ves      0.14
лю       0.14
ries     0.14
Activations Density 0.016%