INDEX
    Explanations

    corruption charges and scandals

    Negative Logits
    م      1.88
    ן      1.50
    ם      1.32
    ش      1.31
    ج      1.27
    ம்     1.26
    на     1.24
    й      1.21
    ння    1.16
    it     1.15
    Positive Logits
    t        1.67
    3        1.47
    ti       1.33
    r        1.33
    li       1.27
    h        1.23
    2        1.20
    v        1.19
    ty       1.08
    token    1.03
    Activation Density: 0.001%

    No Known Activations