INDEX
Explanations
references to Serbia and Serbian identity
Negative Logits
Token      Logit
tar        -0.17
omor       -0.16
reau       -0.15
DUCT       -0.15
ober       -0.15
caption    -0.15
gw         -0.15
eding      -0.14
tras       -0.14
äd         -0.14
Positive Logits
Token      Logit
ancial     0.15
Frost      0.15
Altern     0.14
izm        0.14
615        0.14
monster    0.14
Frm        0.14
/Dk        0.13
oky        0.13
Neu        0.13
Activations Density: 0.004%