INDEX
Explanations
words related to enthusiasm and emphasis
Negative Logits
Token      Logit
antry      -0.77
tein       -0.68
icipated   -0.68
illary     -0.67
ifully     -0.66
RY         -0.64
Purchase   -0.63
ruary      -0.63
arian      -0.62
arthed     -0.62
Positive Logits
Token        Logit
FTWARE       0.78
darn         0.76
appreciated  0.75
liked        0.75
messed       0.74
pissed       0.72
ignment      0.71
polit        0.70
bother       0.70
appreciate   0.69
Activations Density: 0.286%
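The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.286% would mean the feature fires on roughly 3 of every 1,000 sampled tokens.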