INDEX
Explanations
phrases related to expressing support or being supportive towards others
mentions of supportiveness and receptiveness towards individuals or groups
Negative Logits
hid       -0.72
buck      -0.70
ixtape    -0.68
angler    -0.68
iphate    -0.68
hunt      -0.67
Tes       -0.66
hung      -0.66
hig       -0.66
hun       -0.65
Positive Logits
enough      0.88
wcsstore    0.85
supportive  0.83
Supports    0.75
heses       0.74
minded      0.74
hetical     0.74
ative       0.73
affirm      0.70
embrace     0.70
Activations Density: 0.021%