INDEX
    Explanations
    Negative Logits
    " effectués"  0.39
    "দিন"  0.36
    "h"  0.36
    " Méd"  0.35
    " bey"  0.35
    " h"  0.35
    " meditate"  0.35
    "ч"  0.35
    "行う"  0.35
    "Author"  0.34
    Positive Logits
    " измене"  0.73
    "changed"  0.68
    " 변경"  0.68
    "Changed"  0.67
    " CHANGES"  0.65
    "Changes"  0.65
    " changed"  0.64
    " измени"  0.63
    " perubahan"  0.63
    " changes"  0.62
    Act Density 0.010%

    No Known Activations