    NEGATIVE LOGITS
    Token       Logit
    𝘯           2.89
    𝘭           2.78
    𝐥           2.70
    ্লি          2.66
    𝐧           2.61
    izarea      2.53
                2.51
    ização      2.50
    𝘤           2.46
                2.46
    POSITIVE LOGITS
    Token       Logit
    s           2.93
    으로        2.42
    h           2.27
    «           2.25
    ات          2.20
    sman        2.19
    ي           2.18
    ের          2.15
    ing         2.12
    t           2.12
    Activation Density: 0.437%

    No Known Activations