INDEX
    Explanations

    negative or unfavorable sentiments expressed in the text

    Negative Logits
    ries        -0.16
    osu         -0.15
    akov        -0.15
    ponsive     -0.15
    asper       -0.14
    eree        -0.14
    irler       -0.14
    heimer      -0.14
    reira       -0.14
    orbit       -0.14
    Positive Logits
    ly           0.21
    iously       0.19
    LY           0.17
    ically       0.16
    .ly          0.16
    ely          0.15
    brace        0.15
    fully        0.15
    uly          0.14
    etheless     0.14
    Activation Density: 0.917%

    No Known Activations