    Negative Logits
    Token           Value
    "ا"             2.94
    "י"             2.61
    "ي"             2.27
    "ק"             2.23
    "्स"            2.20
    "ە"             1.86
    "IE"            1.84
    "ے"             1.84
    (not shown)     1.83
    "ی"             1.81
    Positive Logits
    Token           Value
    (not shown)     2.08
    "ant"           2.00
    " gehört"       2.00
    "lime"          1.92
    (not shown)     1.92
    "et"            1.90
    " mögliche"     1.90
    " antérieure"   1.87
    "gi"            1.86
    "gt"            1.86
    Activation Density: 0.005%

    No Known Activations