INDEX
    Negative Logits
     κάθε       -0.08
    HW          -0.08
    _begin      -0.07
     quello     -0.07
     שימוש      -0.07
    _inches     -0.07
     الشرطة     -0.07
                -0.07
     esforço    -0.07
    NASDAQ      -0.07
    Positive Logits
    ethu        0.08
     Cub        0.08
     Mohammad   0.07
    ongan       0.07
    ogli        0.07
     waves      0.07
    ungua       0.07
     Sint       0.07
     whims      0.07
     Muhammad   0.07
    Activation Density: 0.145%

    No Known Activations