INDEX
Explanations
phrases related to successful or impactful actions
occurrences of the verb "make" and its variations
Negative Logits
ãĥ´         -0.88
hood        -0.71
ãĥİ         -0.71
Ples        -0.70
tg          -0.68
æ©Ł         -0.66
PLIED       -0.65
thouse      -0.64
andowski    -0.64
pour        -0.64
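Token strings such as ãĥ´, ãĥİ and æ©Ł look like byte-level BPE display forms rather than encoding damage. The sketch below assumes the underlying tokenizer uses the GPT-2 style byte-to-character table, which this page does not confirm; under that assumption each displayed character maps back to one raw byte, and the bytes decode as UTF-8. The helper name `decode_token` is hypothetical and only for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2 style byte-level BPE.

    Assumption: the model behind this dashboard uses this table.
    """
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token string back to the text it encodes."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Partial multi-byte tokens may not be valid UTF-8 on their own.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ´", "ãĥİ", "æ©Ł", "hood"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the three non-ASCII entries decode to the katakana ヴ and ノ and the character 機, while plain ASCII tokens such as hood come back unchanged.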
Positive Logits
strides        1.25
mistakes       1.01
adjustments    1.00
sacrifices     0.96
plays          0.89
sure           0.89
excuses        0.88
saves          0.84
noises         0.82
hift           0.81
Activations Density 0.073%