INDEX
    Explanations

    expressions related to emotional states and sensations

    Negative Logits
    Token          Logit
    "../../../"    -0.17
    "itals"        -0.15
    "esse"         -0.15
    "unidad"       -0.15
    "doi"          -0.14
    "esian"        -0.14
    "ophone"       -0.14
    "mun"          -0.14
    "appe"         -0.14
    "ello"         -0.14
    Positive Logits
    Token        Logit
    "-good"      0.27
    "ings"       0.24
    " sorry"     0.22
    " Sorry"     0.19
    "lessly"     0.19
    "sorry"      0.18
    "inspace"    0.18
    "-safe"      0.17
    "INGS"       0.17
    "good"       0.17
    Activation Density: 0.041%

    No Known Activations