INDEX
Explanations
phrases related to societal issues or events often associated with controversy or public attention
terms related to public sentiment or social commentary
Negative Logits
constitu    -0.58
jun         -0.56
syn         -0.56
DISTR       -0.56
NB          -0.56
ãĥĩãĤ£      -0.56
Slovakia    -0.55
�           -0.54
epis        -0.54
Rept        -0.54
Positive Logits
lime        0.71
gain        0.69
mosp        0.68
boarding    0.65
pret        0.64
sectional   0.63
gling       0.62
eral        0.61
pless       0.59
thereal     0.59
Activations Density 1.356%