    Explanations

    expressions related to emotional experiences and responses

    Negative Logits
    Token     Logit
    avou      -0.16
    acket     -0.15
    ewire     -0.15
    ffen      -0.15
    stp       -0.14
    GBK       -0.14
    aticon    -0.14
    olare     -0.14
    ンジ      -0.14
    aler      -0.13
    Positive Logits
    Token     Logit
    ham       0.16
    420       0.16
    felt      0.15
    aigned    0.14
    urn       0.14
    aneous    0.14
    627       0.14
    яж        0.14
    693       0.14
    661       0.14
    Activation Density: 0.042%

    No Known Activations