INDEX
    Explanations
    Negative Logits
    ка       0.93
    ли       0.91
    ва       0.90
    ча       0.89
    ри       0.82
    ومي      0.78
    ى        0.77
    ность    0.75
    ное      0.75
    ру       0.73
    Positive Logits
    ↵↵              0.97
    H               0.96
    as              0.89
    A               0.84
    K               0.82
     for            0.81
    P               0.81
    ad              0.77
     subject        0.77
     speculation    0.73
    Activation Density 0.009%

    No Known Activations