INDEX
    Explanations

    references to political leaders and their titles

    Negative Logits
    Token         Logit
    "wel"         -0.15
    "velopment"   -0.15
    "andr"        -0.15
    "ICES"        -0.15
    "ispers"      -0.15
    "odu"         -0.14
    "polator"     -0.14
    "anship"      -0.14
    "oooo"        -0.14
    "pending"     -0.14
    Positive Logits
    Token         Logit
    " Emer"       0.17
    "lij"         0.16
    " Serif"      0.15
    "Fast"        0.15
    "azzi"        0.15
    "innen"       0.15
    " anxious"    0.14
    "ÑĨем"        0.14
    " fast"       0.14
    "etag"        0.14
    Activation Density: 0.044%

    No Known Activations