INDEX
    Explanations

    references to "nation" or its variations, signaling a focus on national topics or issues

    Negative Logits
    orie     -0.17
    sse      -0.17
    ly       -0.17
    lyn      -0.17
    ors      -0.16
    ory      -0.15
    dater    -0.15
    ся       -0.15
    leaf     -0.15
    orer     -0.14
    Positive Logits
    wide      0.28
    hood      0.25
    nal       0.23
    -wide     0.22
    alse      0.22
    ally      0.22
    istic     0.21
    /world    0.21
    -states   0.19
    ality     0.19
    Activation Density: 0.015%

    No Known Activations