INDEX
    Negative Logits
     Rover      -0.08
     pêche      -0.08
     Rotate     -0.08
     повер      -0.08
     Neck       -0.08
     rotate     -0.08
     aggior     -0.07
                -0.07
    orlu        -0.07
     несов      -0.07
    Positive Logits
     costing     0.09
     pellets     0.08
    pak          0.08
    _dummy       0.08
    .embed       0.08
    Easy         0.08
     volum       0.08
     filler      0.08
    ost          0.08
    _stub        0.08
    Activation Density: 0.005%

    No Known Activations