INDEX
Explanations
phrases related to deserving or worthy individuals or actions
phrases that express worthiness or deserving of recognition and praise
Negative Logits
nel         -0.74
ullivan     -0.74
nels        -0.72
lems        -0.65
barriers    -0.65
stru        -0.64
isms        -0.63
weeds       -0.63
processes   -0.62
portals     -0.62
Positive Logits
scorn       0.91
praise      0.88
$$$$        0.86
cellence    0.79
ridicule    0.77
EDIT        0.77
ENTION      0.76
criticism   0.74
udos        0.74
scrutiny    0.72
Activations Density 0.105%