    NEGATIVE LOGITS
    ב  0.71
    L  0.67
    ق  0.64
    N  0.59
    ک  0.59
    RE  0.57
    is  0.57
    ب  0.57
    ות  0.56
    ف  0.53
    POSITIVE LOGITS
    ení  0.39
     precession  0.39
    erné  0.38
     неодно  0.38
     significative  0.37
     vigilant  0.37
     napkin  0.36
    enn  0.36
     посредством  0.36
     మరో  0.36
    Activation Density: 0.241%

    No Known Activations