    NEGATIVE LOGITS
    Token      Logit
    "k"        1.16
    "ك"        1.11
    "w"        1.09
    " for"     1.00
    "p"        0.94
    "ı"        0.94
    "ک"        0.92
    "v"        0.91
    "ק"        0.88
    " It"      0.85
    POSITIVE LOGITS
    Token            Logit
    " conformity"    1.11
    " conforms"      0.92
    ">"              0.91
    " سین"           0.89
    "ຂອງ"            0.89
    "ską"            0.89
    "ться"           0.88
    " conform"       0.87
                     0.87
    "ram"            0.85
    Activation Density: 0.003%

    No Known Activations