INDEX
    Explanations

    researchers and developers

    Negative Logits
    Token               Value
    "at"                1.34
    "to"                1.17
    "ری"                1.17
    "ش"                 1.15
    "ع"                 1.12
    "t"                 1.11
    "as"                1.09
    "ت"                 1.07
    "س"                 1.07
                        1.04
    Positive Logits
    Token               Value
    "’”"                0.86
    "lerce"             0.81
    "li"                0.76
    " professionnels"   0.73
    "liğini"            0.72
    "preneurs"          0.71
                        0.70
    " politici"         0.69
    " الذين"            0.68
    " мощности"         0.68
    Act Density 0.918%

    No Known Activations