    Explanations

    references to individuals in political contexts

    Negative Logits
    Token        Logit
    "itsu"       -0.18
    "elsey"      -0.17
    " Canter"    -0.17
    "çª"         -0.17
    "itter"      -0.17
    "ero"        -0.17
    "itch"       -0.17
    "tsky"       -0.16
    "ahun"       -0.16
    "utr"        -0.16
    Positive Logits
    Token        Logit
    "tin"        0.19
    "aryana"     0.18
    "asm"        0.17
    "ema"        0.17
    "iral"       0.17
    "UDA"        0.17
    "iss"        0.17
    "iran"       0.17
    "uda"        0.16
    "iday"       0.16
    Activation Density: 0.027%

    No Known Activations