    Negative Logits
    " reaching"     -0.08
    " jal"          -0.08
    " llegado"      -0.08
    " cheg"         -0.08
    " reaches"      -0.08
    "Cry"           -0.08
    " raf"          -0.07
    " 활용"         -0.07
    " reached"      -0.07
    " třeba"        -0.07
    Positive Logits
    " запрещ"        0.10
                     0.10
    "は禁止"         0.10
    " соблюдать"     0.09
    " বাধ"           0.09
    " prohibited"    0.09
    " মতে"           0.09
    " Einschr"       0.09
    " Restrictions"  0.08
                     0.08
    Activation Density: 0.002%

    No Known Activations