INDEX
Explanations
phrases indicating a forceful action or a significant effort
phrases that include the term "pull out"
Negative Logits
ould          -0.77
llah          -0.70
esa           -0.68
orld          -0.66
umbered       -0.66
hedon         -0.64
eli           -0.64
pton          -0.61
enegger       -0.60
Queue         -0.60
Positive Logits
stretched     0.83
wards         0.77
itives        0.72
microphones   0.72
levers        0.71
roots         0.67
Snap          0.64
Medline       0.63
weeds         0.63
loopholes     0.61
Activations Density: 0.035%