    NEGATIVE LOGITS
    "op"               1.05
    " perchè"          1.04
    " i"               0.98
    " y"               0.98
    "️⃣"               0.96
    " stata"           0.96
    (not recovered)    0.95
    " M"               0.95
    " l"               0.94
    " e"               0.94
    POSITIVE LOGITS
    "يت"               1.23
    (not recovered)    1.10
    "گ"                1.09
    (not recovered)    1.05
    "ва"               1.01
    "де"               1.00
    "ма"               0.97
    "ترة"              0.97
    "velles"           0.95
    "newtheorem"       0.95
    Act Density 0.003%

    No Known Activations