    Explanations

    emotional expressions or reactions

    Negative Logits
    " = "  -0.67
    "Â"  -0.65
    " Â"  -0.64
    " â"  -0.63
    "Ã"  -0.63
    "apatalk"  -0.62
    " ${\"  -0.61
    " ‪"  -0.58
    "${\"  -0.57
    -0.57
    Positive Logits
    " 🥺"  0.88
    " abt"  0.84
    " ✨"  0.77
    " 😭😭"  0.75
    "🥺"  0.75
    " 😭"  0.73
    " 🥲"  0.73
    ",,,"  0.73
    " 👀"  0.73
    " 💀"  0.71
    Act Density 0.193%

    No Known Activations