Explanations
strong verbs or nouns ending in '-ing'
words that suggest strength or robustness
Negative Logits
drum    -0.72
stink   -0.71
spin    -0.70
dust    -0.69
fever   -0.68
rift    -0.67
fuss    -0.66
sweat   -0.65
spin    -0.64
trump   -0.63
Positive Logits
itionally   1.06
ations      1.06
ivably      1.04
ational     1.01
atic        0.99
itably      0.98
ATIONS      0.96
ances       0.96
atically    0.93
aci         0.93
Activations Density 0.155%