INDEX
    Explanations

    references to emotional states and psychological concepts

    Negative Logits
    Token        Logit
    "iman"       -0.17
    "ogle"       -0.15
    "afa"        -0.14
    "etÃŃ"       -0.14
    "uchs"       -0.14
    "enz"        -0.14
    "imen"       -0.14
    "ëĭĪìĬ¤"     -0.14
    "šem"        -0.14
    "ingers"     -0.13
    Positive Logits
    Token             Logit
    "μÏĢο"            0.17
    "bour"            0.15
    " exclus"         0.14
    "Ïģιν"            0.14
    "might"           0.14
    " bunu"           0.14
    "Ñģли"            0.14
    "haps"            0.13
    " responsible"    0.13
    "483"             0.13
    Activation Density: 0.003%

    No Known Activations