Explanations
phrases expressing skepticism or criticism in political contexts
Negative Logits
еÑģÑĤи     -0.15
ç±        -0.15
aten      -0.15
uffy      -0.15
alu       -0.15
uci       -0.15
Osborne   -0.14
apol      -0.14
foy       -0.14
thụ       -0.14
Positive Logits
loe       0.16
ãĥªãĥ³    0.16
Package   0.15
=:        0.14
irs       0.14
instead   0.14
pow       0.13
ucher     0.13
ilate     0.13
ега       0.13
Activations Density 0.214%
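
The density figure is not defined in this listing. One common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: the source does not say how its density is computed.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 214 active tokens out of 100,000 sampled tokens.
sample = [1.0] * 214 + [0.0] * (100_000 - 214)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.214% would mean the feature fires on roughly 2 of every 1,000 tokens, which is typical of a sparse feature.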