INDEX
    NEGATIVE LOGITS
    " рек"          -0.07
    "SpecWarn"      -0.07
    " opendir"      -0.06
    " WALL"         -0.06
    "空间"          -0.06
    " represent"    -0.06
    " Kul"          -0.06
    " Ulus"         -0.06
    "Isl"           -0.06
    [missing]       -0.06
    POSITIVE LOGITS
    " batch"        0.16
    " Batch"        0.13
    "_batch"        0.13
    "Batch"         0.12
    "batch"         0.12
    ".batch"        0.10
    " batches"      0.09
    "atch"          0.09
    "_batches"      0.09
    "=batch"        0.08
    Activation Density: 0.005%

    No Known Activations