    Explanations

    references to nuclear weapons and related incidents

    Negative Logits
    Token              Logit
    " band"            -0.16
    " newItem"         -0.15
    "zy"               -0.14
    "olie"             -0.14
    " jb"              -0.14
    "obe"              -0.14
    " Grim"            -0.14
    "jong"             -0.13
    "band"             -0.13
    " affection"       -0.13

    Positive Logits
    Token              Logit
    "ovaly"             0.14
    "aversable"         0.14
    "ëģ¼"               0.14
    "ssp"               0.14
    "-transitional"     0.14
    "errat"             0.14
    "parity"            0.14
    "pollo"             0.13
    "empo"              0.13
    "oky"               0.13
    Activation Density: 0.022%

    No Known Activations