    NEGATIVE LOGITS
    Token       Logit
     IPC        -0.06
    ROLE        -0.06
    Too         -0.06
    Twenty      -0.06
    ="/"        -0.05
     fibr       -0.05
    spe         -0.05
                -0.05
    ).↵↵↵↵      -0.05
    	mode       -0.05
    POSITIVE LOGITS
    Token       Logit
     بأ         0.07
     "}\        0.07
    اشت         0.07
     resulted   0.07
     Exactly    0.07
     đâu        0.07
    smith       0.07
     presents   0.06
     estamos    0.06
    uen         0.06
    Activation Density 0.006%

    No Known Activations