INDEX
    Explanations
    Negative Logits
    Token            Logit
    " Drapeau"       -0.53
    " tu"            -0.49
    "res"            -0.48
    "tivism"         -0.47
    " you"           -0.47
    " سكانية"        -0.47
    " tâm"           -0.47
    "mable"          -0.47
    " me"            -0.47
    "let"            -0.47
    Positive Logits
    Token               Logit
    "findpost"          0.78
    " الرياضيه"         0.71
    "ientôt"            0.65
    "AndEndTag"         0.63
    " barnen"           0.62
    " utafitiHapana"    0.60
    "الإنجليزية"        0.60
    " berikutnya"       0.60
    " skydd"            0.57
    " skolan"           0.56
    Activation Density: 0.072%

    No Known Activations