    Negative Logits
        " viruses"    -0.08
        " og"         -0.08
        " eel"        -0.08
        " double"     -0.08
        " breast"     -0.08
        " еж"         -0.07
        " вирус"      -0.07
        " chimp"      -0.07
        " virus"      -0.07
        " окру"       -0.07
    Positive Logits
        " weighting"   0.12
        " weighing"    0.11
        " weigh"       0.11
        ".weight"      0.11
        " weighed"     0.10
        " Weighted"    0.10
        " भार"         0.10
        " weighted"    0.10
        " وزن"         0.09
        "weighted"     0.09
    Activation Density: 0.005%

    No Known Activations