INDEX
    Explanations
    No Explanations Found
    Negative Logits
    "akaran"      0.44
    "𝘫"           0.44
    "ர்ம"          0.42
    "masının"     0.42
    "ಂದು"         0.42
    "情報の"       0.42
    "布置"         0.41
    " motives"    0.41
    "ಂಭ"          0.41
    "wab"         0.41
    Positive Logits
    " دين"        0.57
    " head"       0.53
    " lát"        0.52
    " ذلك"        0.49
    " Най"        0.48
    " Head"       0.48
    " رأس"        0.47
    "Teacher"     0.47
    " شي"         0.46
    "LE"          0.46
    Act Density 0.005%

    No Known Activations