INDEX
Explanations
phrases related to social issues and political debates
Negative Logits
Siber          -0.64
Nav            -0.64
translation    -0.60
WIND           -0.58
details        -0.57
Sequ           -0.57
Lv             -0.56
オ             -0.56
キ             -0.54
Accessed       -0.54
Positive Logits
deserve        0.87
tended         0.85
were           0.84
fame           0.81
are            0.80
have           0.76
succumbed      0.76
ought          0.76
contend        0.76
rejoice        0.75
Activations Density: 2.585%