INDEX
    Explanations
    Negative Logits
     وعلى  2.06
    sning  1.87
     quidem  1.63
    𝐦  1.53
    mathrm  1.51
    ki  1.48
    s  1.45
     avian  1.44
     kube  1.41
     Hotspur  1.40
    Positive Logits
    ى  2.31
    ed  2.23
    يد  2.00
    ف  1.85
    ت  1.80
    ع  1.80
    ab  1.77
    را  1.74
    ير  1.74
    یا  1.72
    Activation Density: 0.003%

    No Known Activations