INDEX
    Explanations

    code keywords and punctuation

    Negative Logits
    Token        Logit
    ","          0.88
                 0.71
    " "          0.70
    "কে"         0.69
    "pathy"      0.67
    "speople"    0.65
    "tble"       0.63
    " an"        0.63
    " (<"        0.63
    "jaa"        0.63
    Positive Logits
    Token        Logit
    "and"        1.11
    "و"          1.10
    "ه"          0.98
    " for"       0.95
    " and"       0.91
    "ik"         0.91
    "1"          0.88
    "на"         0.87
    "ни"         0.84
    "P"          0.83
    Activation Density: 0.122%

    No Known Activations