    Explanations

    mentions of Russia and related entities

    Negative Logits
    Token       Logit
    "ひ"        -0.16
    "ifr"       -0.16
    "bos"       -0.15
    "iswa"      -0.15
    "asco"      -0.15
    "orns"      -0.14
    "TER"       -0.14
    "ibase"     -0.14
    "ivor"      -0.14
    "-long"     -0.13
    Positive Logits
    Token           Logit
    " Federation"   0.29
    " Roulette"     0.23
    "시아"          0.20
    "罗斯"          0.20
    " roulette"     0.18
    " Фед"          0.18
    " Dmit"         0.17
    " Fed"          0.17
    " Feder"        0.16
    "Fed"           0.16
    Act Density 0.023%

    No Known Activations