INDEX
    Explanations
    Negative Logits
    los         -0.07
    nas         -0.07
    YTE         -0.06
    elize       -0.06
     rainfall   -0.06
    lea         -0.06
     Orwell     -0.06
     verst      -0.06
     Projekt    -0.06
    <Type       -0.06
    Positive Logits
     around     0.06
     дол        0.06
    -round      0.06
    geme        0.06
    #SBATCH     0.06
    ้ไข         0.06
                0.06
    _google     0.06
    切り        0.06
     Crate      0.06
    Activation Density: 0.004%

    No Known Activations