INDEX
Explanations
phrases related to showing support for or against various causes or individuals
phrases that express support for various causes or groups
Negative Logits
ĸļ            -0.69
photographed  -0.68
fing          -0.66
fx            -0.65
llo           -0.65
orp           -0.64
ashtra        -0.64
Tracker       -0.63
iae           -0.62
ways          -0.62
Positive Logits
embattled     0.75
Mandatory     0.72
thood         0.72
charities     0.69
vested        0.68
alternative   0.68
gotten        0.67
independence  0.66
aut           0.65
ideals        0.65
Activations Density: 0.117%