INDEX
    Explanations
    NEGATIVE LOGITS
    Token              Logit
    " amic"            -0.09
    " parted"          -0.08
    " plomb"           -0.08
    "Lic"              -0.07
    " सौ"              -0.07
    "эконом"           -0.07
    " Umb"             -0.07
    " mener"           -0.07
    (token not shown)  -0.07
    " divisor"         -0.07
    POSITIVE LOGITS
    Token              Logit
    "neur"             0.09
    " khả"             0.08
    " cow"             0.08
    ".activate"        0.08
    (token not shown)  0.08
    " intr"            0.08
    " posture"         0.08
    " capacités"       0.08
    " activate"        0.08
    "depth"            0.08
    Activation Density: 0.006%

    No Known Activations