INDEX
    Explanations

    descriptions of physical actions and dialogues

    Negative Logits
    ancial        -0.76
    Reviewer      -0.73
    Sources       -0.72
    conservancy   -0.72
    NW            -0.71
    national      -0.71
     stunts       -0.70
     nominees     -0.70
     Critics      -0.69
    regate        -0.68
    Positive Logits
     grin         1.22
     nodded       1.19
     smir         1.17
     grinned      1.13
     smiled       1.12
     frown        1.12
     grinning     1.12
     murm         1.10
     nods         1.09
     smile        1.08
    Activation Density: 1.796%

    No Known Activations