INDEX
    Explanations

    mathematical expressions

    Negative Logits
    Token            Logit
    "_health"        -0.07
    " laid"          -0.07
    "\tcat"          -0.07
    " ตร"            -0.07
    " lvl"           -0.06
    "(area"          -0.06
    " constr"        -0.06
    (token missing)  -0.06
    " harming"       -0.06
    "_parts"         -0.06
    Positive Logits
    Token            Logit
    "MPI"            0.08
    " müşteri"       0.08
    " certif"        0.07
    "ıyoruz"         0.07
    "judul"          0.06
    "уется"          0.06
    " jew"           0.06
    " Mustang"       0.06
    " Demir"         0.06
    " تهیه"          0.06
    Activation Density: 0.003%

    No Known Activations