INDEX
    Explanations
    Negative Logits
    Token         Logit
    " terro"      -0.08
    "Visitor"     -0.07
    " visits"     -0.07
    "lood"        -0.07
    "(TEST"       -0.07
    "pts"         -0.07
    "_walk"       -0.07
    " fleas"      -0.07
    " Kontakt"    -0.07
    " during"     -0.07
    Positive Logits
    Token         Logit
    " mbo"        0.09
    " zb"         0.08
    " लगे"        0.08
    " száz"       0.08
    (missing)     0.08
    " వెంట"       0.07
    " emi"        0.07
    " wol"        0.07
    "ме"          0.07
    " weten"      0.07
    Act Density: 0.001%

    No Known Activations