INDEX
    Explanations
    No Explanations Found
    Negative Logits
    f             1.03
    Kate          1.00
    کړئ           0.99
    g             0.98
    υ             0.96
    conscience    0.95
    atamente      0.95
    atrice        0.93
    ש             0.92
    ɱ             0.92
    Positive Logits
    time            1.40
    <unused2222>    1.38
    sigh            1.32
    dalam           1.27
    kker            1.27
                    1.25
    शासन            1.24
                    1.24
    ritical         1.23
    dough           1.21
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.