INDEX
    Explanations

    names of individuals

    words related to individual or group identities and social handles

    Negative Logits
    Token                Logit
    " optics"            -0.67
    "s"                  -0.65
    "mosp"               -0.64
    "net"                -0.62
    "deck"               -0.62
    " shoulders"         -0.62
    " Attribution"       -0.61
    "ENTION"             -0.60
    " accommodations"    -0.60
    " Immunity"          -0.59

    Positive Logits
    Token                Logit
    "ppa"                1.44
    "zzi"                1.37
    "ppo"                1.35
    "zzle"               1.34
    "zza"                1.32
    "pta"                1.28
    "ppe"                1.26
    "ÅŁ"                 1.24
    "jo"                 1.20
    "lda"                1.19

    Activation Density: 0.230%

    No Known Activations