    Negative Logits
     kháng    -0.07
    *"    -0.06
    '):    -0.06
    PlainText    -0.06
    ايات    -0.06
    になって    -0.06
    아파트    -0.06
    จะต    -0.06
     James    -0.06
     stress    -0.06
    Positive Logits
     few    0.11
     Few    0.10
     fewer    0.09
     other    0.08
    .food    0.08
    Few    0.08
     new    0.07
    NUM    0.07
     nib    0.07
     Genç    0.07
    Activation Density: 0.014%

    No Known Activations