Explanations
themes related to social issues and political conflicts
Negative Logits
ô        -0.15
Xxx      -0.15
ерт      -0.14
éru      -0.14
RIEND    -0.14
PHA      -0.14
запас    -0.14
orce     -0.13
пор      -0.13
си       -0.13
Positive Logits
(((              0.19
Establishment    0.16
,,               0.16
Garland          0.15
((((             0.15
iggers           0.15
akov             0.15
olec             0.15
)))              0.14
arro             0.14
Activations Density 0.494%