INDEX
    Explanations
    NEGATIVE LOGITS
    Token              Logit
    (not shown)        -0.10
    " deelt"           -0.09
    " வீர"             -0.08
    " apartments"      -0.08
    (not shown)        -0.08
    "ymo"              -0.08
    (not shown)        -0.08
    " Мал"             -0.08
    " married"         -0.08
    "夫妻"             -0.08
    POSITIVE LOGITS
    Token              Logit
    " pervasive"       0.08
    " verzichten"      0.08
    " porous"          0.08
    " lurking"         0.07
    " dictum"          0.07
    "は禁止"           0.07
    " looming"         0.07
    " Intel"           0.07
    " prohibition"     0.07
    " лиш"             0.07
    Activation Density: 0.004%

    No Known Activations