    Explanations

    references to the concept of "nation" and its variations

    Negative Logits
    Token     Logit
    s         -0.18
    nice      -0.16
    orie      -0.15
    ive       -0.15
    sse       -0.15
    otty      -0.15
    avian     -0.15
    out       -0.15
    sko       -0.14
    oke       -0.14

    Positive Logits
    Token     Logit
    hood      0.40
    wide      0.34
    -wide     0.30
    -state    0.29
    ally      0.28
    alse      0.28
    -states   0.28
    aal       0.26
    /world    0.26
    nal       0.26

    Act Density 0.017%

    No Known Activations