INDEX
    Explanations
    NEGATIVE LOGITS
    " generated"    -0.07
    " Interest"     -0.07
    "_res"          -0.07
    "	dest"         -0.07
    "Diff"          -0.06
    "Intent"        -0.06
    "łu"            -0.06
                    -0.06
    "ustralia"      -0.06
    "term"          -0.06
    POSITIVE LOGITS
    "озд"           0.07
    "	align"        0.07
    " dom"          0.06
    "<Form"         0.06
    ":user"         0.06
    ":YES"          0.06
    " школи"        0.06
    "inston"        0.06
    " سازمان"       0.06
    " وزن"          0.06
    Act Density 0.008%

    No Known Activations