INDEX
    Explanations

    names of individuals, particularly those associated with popular culture or media

    Negative Logits
    Token    Logit
    su       -0.19
    st       -0.19
    pay      -0.17
    nit      -0.17
    elli     -0.17
    lex      -0.17
    speed    -0.16
    sy       -0.16
    ned      -0.16
    sch      -0.16
    Positive Logits
    Token    Logit
    yyyy     0.22
    eva      0.22
    ean      0.21
    yyy      0.20
    lic      0.20
    lation   0.20
    ville    0.20
    mania    0.19
    tics     0.18
    ahoo     0.18
    Activation Density: 0.052%

    No Known Activations