INDEX
    Explanations

    phrases indicating feelings of exclusion or inadequacy

    Negative Logits
    "duk"            -0.15
    "agens"          -0.15
    "iams"           -0.15
    "cak"            -0.15
    "inus"           -0.14
    "icari"          -0.14
    " Kostenlose"    -0.14
    "umd"            -0.14
    "íĻį"            -0.14
    "reeze"          -0.14
    Positive Logits
    " even"          0.16
    "RING"           0.16
    "even"           0.15
    "ountain"        0.14
    " Hick"          0.14
    " sometimes"     0.14
    " Neu"           0.14
    " cav"           0.14
    " cred"          0.13
    "-basket"        0.13
    Activation Density: 0.037%

    No Known Activations