    Negative Logits
    "gli"           -0.08
    " spectral"     -0.08
    " intric"       -0.07
    " basically"    -0.07
    " yahay"        -0.07
    " enough"       -0.07
    " macam"        -0.07
    " ACE"          -0.07
    " phosphate"    -0.07
    " buffalo"      -0.07
    Positive Logits
    " Uncomment"    0.14
    " uncomment"    0.11
    " deseas"       0.10
    " möchtest"     0.10
    " desejar"      0.10
    " möchten"      0.10
    " хотите"       0.09
    "nasium"        0.09
    " ønsk"         0.09
    " ترغب"         0.09
    Activation Density: 0.044%

    No Known Activations