INDEX
    Explanations
    Negative Logits
    Token                 Logit
    " tartalomajánló"     -0.67
    " sostenibilidad"     -0.59
    " almofada"           -0.57
    "ukone"               -0.56
    "BibitemShut"         -0.55
    " colina"             -0.54
    " Schluß"             -0.54
    " Einfluß"            -0.54
    "transQ"              -0.53
    " müm"                -0.53
    Positive Logits
    Token                 Logit
    " New"                1.52
    "New"                 1.19
    " North"              0.84
    " Нью"                0.74
    " 뉴"                 0.73
    " South"              0.72
    " Nueva"              0.71
    " ني"                 0.70
    " NEW"                0.69
    " ニュー"             0.69
    Activation Density: 0.016%

    No Known Activations