    Negative Logits
    Token             Logit
    ب                 4.25
    بك                2.75
    ர்                2.50
    ри                2.47
    (no token text)   2.41
    tes               2.22
    م                 2.22
    ك                 2.22
    (no token text)   2.19
    פּ                 2.17
    Positive Logits
    Token   Logit
    elt     2.69
     וכ     2.44
    are     2.36
    ata     2.36
    ato     2.25
    ela     2.13
    asa     2.13
    ian     2.05
    ats     2.03
    ent     2.00
    Activation Density: 0.098%

    No Known Activations