INDEX
    Negative Logits
    Token         Logit
    "ي"           1.23
    " for"        1.20
    (blank)       1.17
    "يلا"         1.17
    "يها"         1.09
    " meurt"      1.06
    "يان"         1.05
    " lanç"       1.03
    " regrett"    1.03
    " incend"     1.03
    Positive Logits
    Token         Logit
    "0"           1.72
    " you"        1.36
    "you"         1.20
    "ill"         1.14
    "1"           1.11
    "ur"          1.10
    ":"           1.10
    "or"          1.05
    " yourself"   1.05
    "AH"          1.00
    Activation Density: 1.790%

    No Known Activations