INDEX
    Negative Logits
        " اعلام"       -0.08
        ".object"      -0.08
        " ره"          -0.08
        ".vars"        -0.08
        " أمام"        -0.08
        "forced"       -0.08
        "ужден"        -0.08
        "heter"        -0.08
        "Presenter"    -0.08
        " kız"         -0.07
    Positive Logits
        "函数"          0.13
        " 함수"         0.12
        " returns"     0.11
        " utility"     0.11
        " функция"     0.10
        " функцию"     0.10
        " Funktion"    0.10
        " retorna"     0.09
        " Utility"     0.09
        " Returns"     0.09
    Activation Density: 0.047%

    No Known Activations