    Negative Logits
    Token           Logit
     endeavors      -0.08
    ението          -0.08
     tập            -0.07
     WH             -0.07
    wetten          -0.07
    -songwriter     -0.07
                    -0.07
     объедин        -0.07
     hopes          -0.07
     bán            -0.07
    Positive Logits
    Token           Logit
     tentang        0.09
     wording        0.08
    关于            0.08
     mengenai       0.08
    About           0.08
     بشأن           0.07
     About          0.07
    Sobre           0.07
    เกี่ยว          0.07
     revise         0.07
    Activation Density: 0.014%

    No Known Activations