INDEX
    Explanations

    names of countries or regions

    Negative Logits
    "gerald"   -0.82
    "igue"     -0.80
    "MENTS"    -0.80
    "lli"      -0.80
    "llo"      -0.76
    "eus"      -0.74
    "MENT"     -0.74
    "igor"     -0.73
    "ashtra"   -0.71
    "lla"      -0.70
    Positive Logits
    " ì"            1.02
    " Jong"         1.00
    " ë"            1.00
    "ongyang"       0.93
    " Kardashian"   0.92
    "orea"          0.86
    "ë"             0.86
    "stress"        0.85
    " Korea"        0.84
    " peninsula"    0.83
    Act Density 0.858%

    No Known Activations