INDEX
    Negative Logits
    (no visible token)   -0.07
    (no visible token)   -0.07
     Marilyn             -0.06
    >You                 -0.06
    (no visible token)   -0.06
     facile              -0.06
     efect               -0.06
     negativity          -0.06
    (no visible token)   -0.06
    udies                -0.06
    Positive Logits
    .define              0.07
    (no visible token)   0.07
     dùng                0.06
    .addClass            0.06
     recursive           0.06
     capacit             0.06
    "\                   0.06
    �細                   0.06
    but                  0.06
     привод              0.06
    Activation Density 0.004%

    No Known Activations