    Negative Logits
    Token             Logit
    ' Στα'            -0.07
    ' };'             -0.07
    'atoon'           -0.07
    ' Таким'          -0.07
    ' kiş'            -0.06
                      -0.06
    'ّت'              -0.06
    '971'             -0.06
    ' courageous'     -0.06
    'ikt'             -0.06
    Positive Logits
    Token             Logit
    ' problem'        0.09
    ' Problem'        0.07
    'Problem'         0.07
    '(":'             0.07
    ' Problems'       0.06
                      0.06
    ' overflow'       0.06
    'mma'             0.06
    ' offenses'       0.06
    'ulum'            0.06
    Activation Density: 0.015%

    No Known Activations