    Negative Logits
    k
    0.39
    w
    0.35
    on
    0.34
    ان
    0.33
     on
    0.33
    ik
    0.31
    எஸ்
    0.31
    c
    0.31
     história
    0.30
    0.30
    Positive Logits
     a
    0.32
    다는
    0.31
    ס
    0.29
     바탕
    0.28
     morts
    0.28
    𝙮
    0.28
    يد
    0.28
    𝙢
    0.27
     Isles
    0.27
    𝚐
    0.27
    Activation Density: 0.346%

    No Known Activations