INDEX
    Explanations

    expressions of emotion or non-verbal communication

    expressions of emotions or reactions

    Negative Logits
    Token          Logit
    "arthed"       -0.84
    "artifacts"    -0.76
    "OTOS"         -0.76
    "icipated"     -0.75
    "sites"        -0.74
    "senal"        -0.73
    "ategory"      -0.72
    "irements"     -0.70
    "inction"      -0.70
    " conting"     -0.70
    Positive Logits
    Token          Logit
    " grin"        1.10
    " smile"       0.95
    " grinning"    0.93
    " impatient"   0.92
    " smiling"     0.92
    " roar"        0.90
    " gigg"        0.88
    " angrily"     0.87
    " exclaim"     0.87
    " smug"        0.86
    Activation Density: 0.370%

    No Known Activations