    Explanations
    NEGATIVE LOGITS
        Token          Value
        (not shown)    1.25
        "،"            1.15
        "you"          1.02
        (not shown)    0.99
        (not shown)    0.93
        " ہے۔"         0.91
        "人々"         0.87
        "ある"         0.84
        " you"         0.84
        " are"         0.82
    POSITIVE LOGITS
        Token          Value
        "il"           1.13
        (not shown)    1.09
        "r"            1.06
        "and"          1.04
        " Variables"   1.03
        " Variable"    1.02
        " and"         1.00
        "ing"          0.99
        "Z"            0.98
        "ah"           0.90
    Activation Density: 0.023%

    No Known Activations