INDEX
    Explanations

    countries, political figures, and government-related terms

    proper nouns related to geopolitical issues and countries

    Negative Logits
    mble          -0.62
     Niet         -0.61
    indu          -0.52
     Hiroshima    -0.50
    yip           -0.49
    RW            -0.49
    pires         -0.47
    }"            -0.46
    veyard        -0.46
    tumblr        -0.46
    Positive Logits
    's            0.95
     coffers      0.68
    ís            0.65
     because      0.59
     amid         0.58
     whereby      0.57
    Care          0.56
     throughout   0.56
     through      0.54
     lately       0.54
    Activation Density: 0.471%

    No Known Activations