INDEX
    Explanations

    instances of strong sentiments or emotional expressions

    Negative Logits
    a         -0.15
    lag       -0.15
    627       -0.15
    "         -0.15
    iu        -0.15
    o         -0.14
    oph       -0.14
    l         -0.14
     stand    -0.14
     scan     -0.14
    Positive Logits
    ÐĶÐļ      0.15
    drm       0.15
    oreach    0.15
    bette     0.15
    grav      0.15
    .byId     0.14
    __$       0.14
     porr     0.14
    mtx       0.14
    ripe      0.14
    Act Density 0.038%

    No Known Activations