INDEX
    Explanations

    instances of emotional and physical pain descriptors

    Negative Logits
    "ÄĻd"       -0.18
    "aim"       -0.16
    "ivot"      -0.16
    "ungan"     -0.15
    "omed"      -0.15
    "eren"      -0.15
    "enek"      -0.15
    "rien"      -0.15
    "indeki"    -0.15
    " Hüs"      -0.15
    Positive Logits
    " Torrent"  0.15
    "yper"      0.15
    " nett"     0.15
    " alt"      0.14
    "_CTL"      0.14
    " Netz"     0.14
    " Ott"      0.14
    "alt"       0.14
    "ZR"        0.14
    "æº"        0.14
    Act Density 0.012%

    No Known Activations