INDEX
Explanations
phrases related to societal issues and controversies
Negative Logits
uay        -0.83
ntil       -0.81
Murd       -0.78
rick       -0.78
oil        -0.74
ellen      -0.73
achus      -0.72
ocene      -0.72
icka       -0.72
Diamond    -0.71
Positive Logits
matters       0.83
drastic       0.81
ities         0.81
ties          0.81
mundane       0.78
blatant       0.76
kinds         0.74
minded        0.74
things        0.74
sentiments    0.73
Activations Density: 0.268%