INDEX
    Explanations

    negative connotations or mentions of decline

    Negative Logits
    Token           Logit
    "377"           -0.15
    "nore"          -0.14
    "360"           -0.14
    " Strauss"      -0.14
    "urette"        -0.14
    "865"           -0.14
    "533"           -0.14
    "иж"            -0.14
    " licensors"    -0.14
    "tar"           -0.14
    Positive Logits
    Token           Logit
    ".sponge"       0.17
    "ufen"          0.16
    "(#)"           0.16
    "ë©ĺ"           0.15
    "ingen"         0.14
    "jen"           0.14
    " cred"         0.14
    "ewater"        0.14
    "iment"         0.14
    "ç»ĵ"           0.14
    Activation Density: 0.018%

    No Known Activations