Explanations
terms and phrases that relate to political and social struggles
Negative Logits
Token       Logit
''          -0.97
(blank)     -0.90
'''         -0.85
Перейти     -0.81
''          -0.79
\'          -0.77
.''         -0.77
\'          -0.76
.^          -0.75
'')         -0.72

Positive Logits
Token       Logit
–           2.99
–,          1.88
–           1.34
-           1.26
−           1.24
-,          1.23
$-$         1.22
)–          1.15
--          1.13
,–          1.06
Activations Density 0.380%
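
As a rough illustration of what a density figure like the one above usually means, the sketch below computes the fraction of token positions on which a feature fires. It assumes density is defined as "activation above a threshold, divided by total positions"; the page itself does not state its definition. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions on which
    the feature is active. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 4 of them active.
toy_activations = [0.0] * 996 + [1.2, 0.4, 3.1, 0.05]
density = activation_density(toy_activations)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.380% would correspond to the feature firing on roughly 38 out of every 10,000 sampled token positions.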