INDEX
Explanations
phrases indicating positive emotions
expressions of feeling good or positive sentiments
Negative Logits
distinguished    -0.65
favoured         -0.62
pioneered        -0.61
location         -0.60
erity            -0.59
disadvantage     -0.59
haps             -0.58
Lif              -0.58
coveted          -0.58
hip              -0.57
Positive Logits
stories          0.88
ãĤ´              0.85
ALLY             0.73
lapt             0.70
:)               0.66
andi             0.66
waves            0.65
puff             0.64
enough           0.62
locks            0.61
Activations Density 0.062%