INDEX
    Explanations

    references to nations, particularly in the context of government or geopolitical topics

    Negative Logits
    " Hra"      -0.17
    "ved"       -0.16
    "éŃļ"       -0.15
    "yon"       -0.15
    " Güven"    -0.15
    "aru"       -0.14
    "ivet"      -0.14
    " seg"      -0.14
    "idas"      -0.14
    "çŃĴ"       -0.14
    Positive Logits
    "enda"      0.17
    "hod"       0.15
    "abox"      0.15
    "anoia"     0.14
    " Fog"      0.14
    "inth"      0.14
    "çĽ"        0.14
    "eyh"       0.14
    "eper"      0.14
    " diss"     0.14
    Activation Density 0.003%

    No Known Activations