INDEX
    Explanations

    aryl, hydroxyl, bylaws, Xylos, pylint

    Negative Logits
    i       1.79
    f       1.50
    an      1.42
    ية      1.39
    al      1.38
    í       1.32
    z       1.30
    quela   1.26
     (      1.25
    er      1.24
    Positive Logits
    ک       1.58
    ב       1.45
    ج       1.41
    ت       1.41
    不斷    1.39
    س       1.39
    ب       1.37
    ص       1.36
    𝙨       1.31
            1.30
    Activation Density: 0.062%

    No Known Activations