INDEX
    Explanations

    intensifiers or adjectives that convey high emphasis

    Negative Logits
    acades      -0.17
    traction    -0.15
    ans         -0.14
    olis        -0.14
    lore        -0.14
    unas        -0.14
    ©           -0.13
    aho         -0.13
    eras        -0.13
    ver         -0.13

    Positive Logits
    iesen       0.17
    igt         0.17
    bout        0.15
    ìķ          0.15
    ÏĨα         0.14
    _RUN        0.14
    AZE         0.14
    usta        0.14
    ĮĢ          0.14
    ambia       0.13
    Activation Density 0.101%

    No Known Activations