INDEX
    Explanations

    adjectives that express strong emotions like surprise or concern

    terms expressing surprise or concern

    Negative Logits
    "resp"          -0.72
    "agine"         -0.66
    "elf"           -0.65
    "ournal"        -0.65
    "vous"          -0.65
    "ighth"         -0.61
    "aper"          -0.60
    "cipl"          -0.59
    "bey"           -0.59
    "ravings"       -0.58

    Positive Logits
    " enough"        1.07
    "LY"             0.98
    " nonetheless"   0.93
    "ly"             0.86
    "ingly"          0.82
    " because"       0.77
    " considering"   0.75
    " insofar"       0.75
    " JPM"           0.73
    "200000"         0.69

    Act Density 0.184%

    No Known Activations