INDEX
    Explanations
    Negative Logits
    "lera"            -0.08
    " antis"          -0.07
    " check"          -0.07
    " antic"          -0.07
    " altamente"      -0.07
    "asia"            -0.07
    " tan"            -0.07
    "Operations"      -0.07
    " عالي"           -0.07
    "وا"              -0.07
    Positive Logits
    " retra"           0.10
    " mortgages"       0.09
    " 부담"            0.08
    " 부탁"            0.08
    " Zuschauer"       0.08
    " 책임"            0.08
                       0.08
    " zomb"            0.08
    " raconter"        0.08
    " Geschichten"     0.08
    Activation Density: 0.011%

    No Known Activations