INDEX
    Explanations

    expressions of strong emotions and personal sentiments

    Negative Logits
    440          -0.16
    OnInit       -0.15
    453          -0.15
    ocab         -0.15
    vid          -0.14
    avel         -0.14
    vida         -0.14
    *>*          -0.14
     normals     -0.14
    653          -0.14
    Positive Logits
    upe          0.17
    inge         0.16
    SSF          0.15
    ader         0.14
    ì¿ł          0.14
    èĩ£          0.14
     sincerely   0.13
    کس           0.13
     Rex         0.13
    nage         0.13
    Activation Density: 0.118%

    No Known Activations