INDEX
Explanations
words related to societal issues and policies
instances of the word "like," indicating comparisons or examples across various topics
Negative Logits
atform    -0.79
ells      -0.79
essing    -0.78
arbon     -0.77
enary     -0.76
atched    -0.73
essed     -0.72
acia      -0.72
inion     -0.71
ennes     -0.71
Positive Logits
lihood    1.31
lier      0.97
ours      0.94
liest     0.87
hers      0.78
yours     0.75
minded    0.72
liness    0.69
minded    0.68
åĭ        0.67
Activations Density 0.068%