INDEX
    Explanations
    NEGATIVE LOGITS
     dividir
    -0.08
    ঠিক
    -0.07
    -0.07
    wap
    -0.07
     infiltration
    -0.07
    ,int
    -0.07
    暂停
    -0.07
     particulière
    -0.07
    -0.07
    -0.07
    POSITIVE LOGITS
     originality
    0.14
     уник
    0.13
    原创
    0.12
     novel
    0.11
     придум
    0.11
     Novel
    0.11
     uniqueness
    0.11
     nouve
    0.10
     innovate
    0.10
     özg
    0.10
    Act Density 0.082%

    No Known Activations