INDEX
    NEGATIVE LOGITS
    "’ub"  -0.09
    " Finals"  -0.09
    "idered"  -0.08
    " rook"  -0.08
    " üzerinden"  -0.08
    " Uniti"  -0.07
    " Got"  -0.07
    " inseg"  -0.07
    "manship"  -0.07
    " מקצוע"  -0.07
    POSITIVE LOGITS
    "措施"  0.11
    "努力"  0.09
    " maatregelen"  0.09
    "ਾਵ"  0.09
    "deploy"  0.09
    " deploy"  0.09
    "maßnahmen"  0.09
    "ាំង"  0.08
    " шара"  0.08
    " efforts"  0.08
    Activation Density: 0.005%

    No Known Activations