    Negative Logits
    Token            Logit
    " garçons"       -0.08
    " cães"          -0.08
    " flattering"    -0.08
    " زور"           -0.08
                     -0.07
    " הכול"          -0.07
    " chiens"        -0.07
    "重点"           -0.07
    " המד"           -0.07
    " cão"           -0.07
    Positive Logits
    Token            Logit
    "ელი"            0.08
    " Homework"      0.08
    " Classified"    0.08
    "-Date"          0.07
    " meng"          0.07
    "обходимо"       0.07
    "-H"             0.07
    ".Hex"           0.07
    " norma"         0.07
    " Besch"         0.07
    Activation Density: 0.003%

    No Known Activations