INDEX
    Explanations

    instances of emotions or expressions of feelings

    Negative Logits
    "arta"          -0.15
    "allis"         -0.15
    "undan"         -0.14
    "_argument"     -0.14
    " punch"        -0.14
    "ipes"          -0.14
    " lur"          -0.14
    "arts"          -0.14
    " punched"      -0.14
    "LS"            -0.14

    Positive Logits
    " Perception"   0.16
    "iens"          0.16
    "isseur"        0.14
    "OfFile"        0.14
    "enty"          0.14
    " æ¥Ń"          0.14
    "edin"          0.14
    "odash"         0.14
    ".viewer"       0.14
    " FOX"          0.14
    Act Density 0.045%

    No Known Activations