INDEX
    Explanations
    NEGATIVE LOGITS
     Dani
    -0.07
    	from
    -0.07
     computers
    -0.06
    _cut
    -0.06
     transfer
    -0.06
    <u
    -0.06
     برگز
    -0.06
     مسائل
    -0.06
    abor
    -0.06
    txt
    -0.06
    POSITIVE LOGITS
     propos
    0.07
     Sentence
    0.07
    λλη
    0.06
     IMPORTANT
    0.06
    Rule
    0.06
    ��
    0.06
    важа
    0.06
    افية
    0.06
    Production
    0.06
    .Constraint
    0.06
    Act Density 0.002%

    No Known Activations