INDEX
    Explanations
    Negative Logits
    " wants"  -0.07
    " 대한민국"  -0.06
    " manifold"  -0.06
    "时候"  -0.06
    "Compression"  -0.06
    " Pavel"  -0.06
    "今年"  -0.06
    "/time"  -0.06
    "adients"  -0.06
    " Much"  -0.06
    Positive Logits
    "_BOOT"  0.07
    " organic"  0.06
    " аналіз"  0.06
    "tem"  0.06
    0.06
    "ils"  0.06
    "λευτα"  0.06
    " Только"  0.06
    " fanc"  0.06
    "_rotate"  0.06
    Activation Density: 0.012%

    No Known Activations