INDEX
    Explanations
    Negative Logits
    nya     3.84
    م       3.60
    ly      3.58
            3.58
    க்      3.47
    ्स      3.42
    n       3.32
            3.26
    nin     3.23
    k       3.18
    Positive Logits
    ɖ       2.79
    fr      2.33
    ños     2.31
    ș       2.28
    ñ       2.25
    ً       2.21
    ña      2.21
            2.19
    ße      2.18
    ulkner  2.07
    Activation Density 0.309%

    No Known Activations