INDEX
Explanations
positive emotions or appreciative expressions
expressions of praise, appreciation, and lamentation in the context of social and cultural commentary
Negative Logits
oath      -0.74
frog      -0.69
violet    -0.67
shroud    -0.66
beads     -0.66
ribbon    -0.65
assassin  -0.65
looms     -0.64
hairs     -0.64
oats      -0.64
Positive Logits
ably       1.64
ational    1.32
ationally  1.27
ations     1.22
atory      1.20
ability    1.17
ating      1.16
able       1.11
antly      1.10
ately      1.10
Activations Density: 0.039%