INDEX
    Explanations

    references to sanctions and their implications

    NEGATIVE LOGITS
    Token            Logit
    "hai"            -0.15
    "afone"          -0.14
    " discharged"    -0.14
    "squ"            -0.14
    "éré"            -0.14
    "alo"            -0.14
    "509"            -0.14
    "วà¸Ļ"           -0.13
    "hausen"         -0.13
    "rise"           -0.13
    POSITIVE LOGITS
    Token            Logit
    " sanctions"     0.46
    "san"            0.41
    " sanction"      0.40
    " San"           0.38
    " san"           0.34
    "San"            0.33
    "_san"           0.33
    "-san"           0.31
    " SAN"           0.30
    " sanctioned"    0.28
    Act Density 0.079%

    No Known Activations