    Negative Logits
    (no token shown)    1.14
    "ي"    1.07
    "v"    0.92
    (no token shown)    0.89
    "ية"    0.86
    "يت"    0.86
    "د"    0.84
    "os"    0.83
    "ك"    0.82
    "op"    0.81
    Positive Logits
    " Второй"    0.98
    " USAF"    0.85
    " Į"    0.84
    "ંમે"    0.82
    " x"    0.80
    " В"    0.77
    (no token shown)    0.77
    " Trigon"    0.77
    " chord"    0.77
    " cok"    0.76
    Activation Density: 0.001%

    No Known Activations