INDEX
    Explanations
    Negative Logits
    "ยนตร"  -0.07
    " SAT"  -0.07
    " facility"  -0.07
    " physique"  -0.06
    ".\"),↵"  -0.06
    " obviously"  -0.06
    " sourcing"  -0.06
    "oq"  -0.06
    "اضر"  -0.06
    "	J"  -0.06
    Positive Logits
    " 이번"  0.07
    " theological"  0.06
    " Ihr"  0.06
    "iştir"  0.06
    " Compact"  0.06
    "etched"  0.06
    "AA"  0.06
    ".Axis"  0.06
    " costing"  0.06
    "xaa"  0.06
    Activation Density: 0.000%

    No Known Activations