INDEX
    Explanations

    words related to acceptance and agreeing to conditions or ideas

    Negative Logits
    pper      -0.18
    alnız     -0.16
    esp       -0.15
    exact     -0.15
    opathy    -0.14
    ults      -0.14
    cape      -0.14
    CAPE      -0.14
    量        -0.14
    ak        -0.14
    Positive Logits
    ably             0.32
    ance             0.30
    ances            0.25
    ANCE             0.23
    ively            0.22
    reject           0.19
     responsibility  0.18
    eer              0.18
    ive              0.17
     участие         0.17
    Activation Density: 0.037%

    No Known Activations