    Explanations

    expressions of emotional reactions or sentiments

    Negative Logits
    ICODE         -0.15
    yre           -0.15
    lite          -0.15
    uš            -0.14
    hya           -0.13
     overall      -0.13
    ocop          -0.13
    e             -0.13
     thresholds   -0.13
    ogenerated    -0.13
    Positive Logits
    'gc           0.15
    ovit          0.15
    gross         0.15
    678           0.15
    idden         0.14
    erville       0.14
     eoq          0.13
    ket           0.13
    ynet          0.13
    .ta           0.13
    Activation Density: 0.042%

    No Known Activations