    NEGATIVE LOGITS
    Token      Logit
    "in"       1.18
    "to"       1.13
    "i"        1.08
    " lão"     1.03
    "u"        1.02
    "ي"        0.99
    (blank)    0.95
    "۔"        0.95
    " in"      0.94
    " في"      0.93
    POSITIVE LOGITS
    Token        Logit
    (blank)      1.16
    " phone"     1.10
    " synthes"   1.06
    "ある"       1.05
    "Phone"      0.98
    " reorgan"   0.97
    "ли"         0.96
    " you"       0.95
    "ला"         0.94
    "ου"         0.94
    Activation Density 0.015%

    No Known Activations