Explanations
phrases related to political support and public opinion
Negative Logits
-valu
-0.15
iare
-0.15
ประจำ
-0.14
ETHER
-0.14
inaire
-0.14
_Enter
-0.14
iasi
-0.13
eydi
-0.13
WK
-0.13
Payload
-0.13
Positive Logits
support
0.65
Support
0.53
support
0.50
Support
0.46
backing
0.46
SUPPORT
0.44
支持
0.44
upport
0.43
_support
0.43
-support
0.41
Activations Density 0.207%