INDEX
    Explanations
    Negative Logits
    Token          Logit
    "\tleft"       -0.07
    "_decoder"     -0.07
    ":id"          -0.07
    "-less"        -0.07
    " onSelect"    -0.07
    " Ibrahim"     -0.06
    "สาว"          -0.06
    ".Id"          -0.06
    " left"        -0.06
    "(dd"          -0.06
    Positive Logits
    Token              Logit
    " awareness"       0.10
    " Awareness"       0.09
    "面积"             0.07
    "वर"               0.06
    "mín"              0.06
    " Outreach"        0.06
    " indifference"    0.06
    " رای"             0.06
    "emperature"       0.06
    "olerance"         0.06
    Activation Density: 0.009%

    No Known Activations