INDEX
    Explanations

    references to government, military, and health-related topics

    Negative Logits
    "abbit"    -0.20
    " "        -0.19
    "adic"     -0.18
    "adena"    -0.18
    "aden"     -0.17
    "alc"      -0.17
    "abb"      -0.16
    "agli"     -0.16
    "wheel"    -0.16
    "αλ"       -0.16
    Positive Logits
    "them"       0.18
    " ihnen"     0.17
    "they"       0.17
    "ayd"        0.17
    " them"      0.17
    " Ay"        0.17
    " Barney"    0.16
    " Baz"       0.16
    " avant"     0.16
    "ayo"        0.16
    Act Density 0.052%

    No Known Activations