INDEX
    NEGATIVE LOGITS
    Token           Logit
    "snap"          -0.07
    "okus"          -0.06
    " kittens"      -0.06
    " Maher"        -0.06
    "locks"         -0.06
    (whitespace)    -0.06
    "Learning"      -0.06
    " Audi"         -0.06
    "DG"            -0.06
    "_ARGS"         -0.06
    POSITIVE LOGITS
    Token           Logit
    "IFORM"         0.07
    " benign"       0.07
    " cessation"    0.07
    " έως"          0.07
    "cpy"           0.06
    "(completion"   0.06
    " muscles"      0.06
    " üst"          0.06
    " ordinary"     0.06
    "-party"        0.06
    Activation Density: 0.016%

    No Known Activations