Explanations
words related to societal issues and actions, particularly negative ones
terms associated with political or societal issues
Negative Logits
ĪĴ          -0.73
ertodd      -0.71
McGr        -0.71
ĸļ          -0.68
uala        -0.61
awaru       -0.60
å°Ĩ         -0.58
å§«         -0.58
Edited      -0.57
Maxwell     -0.57
Positive Logits
portion     0.90
aspect      0.80
component   0.80
axis        0.79
impulse     0.79
facade      0.78
element     0.74
notion      0.74
impulses    0.72
iest        0.72
Activations Density: 0.774%