INDEX
Explanations
descriptions of physical actions and dialogues
Negative Logits
Token          Logit
ancial         -0.76
Reviewer       -0.73
Sources        -0.72
conservancy    -0.72
NW             -0.71
national       -0.71
stunts         -0.70
nominees       -0.70
Critics        -0.69
regate         -0.68
Positive Logits
Token          Logit
grin           1.22
nodded         1.19
smir           1.17
grinned        1.13
smiled         1.12
frown          1.12
grinning       1.12
murm           1.10
nods           1.09
smile          1.08
Activations Density: 1.796%