Explanations
terms related to advocacy, activism, and social issues
terms related to formal procedures and governance
Negative Logits
unst           -0.62
solicitor      -0.61
heterogeneity  -0.58
olyn           -0.56
narrowed       -0.56
ymm            -0.56
byn            -0.54
folio          -0.54
charism        -0.53
æĸ¹            -0.53
Positive Logits
ynthesis  0.89
udic      0.75
urized    0.75
eting     0.74
          0.73
ibly      0.73
urable    0.71
netflix   0.71
eval      0.70
bol       0.70
Activations Density: 0.925%