Explanations
mentions of the word "Sad" with varying levels of emphasis
Negative Logits
Token     Logit
sterling  -0.69
FANT      -0.66
platinum  -0.63
姫        -0.63
unde      -0.63
Reloaded  -0.62
pegged    -0.61
bottled   -0.61
Kendrick  -0.60
helic     -0.59
Positive Logits
Token     Logit
omas      1.20
istic     1.04
hus       1.01
emic      1.00
amoto     0.98
hya       0.98
du        0.96
ness      0.95
ako       0.95
acio      0.92
Activations Density: 0.039%