    Explanations
    Negative Logits
    " maten"        -0.08
    "genoot"        -0.08
    (blank)         -0.08
    " triunfo"      -0.08
    "fox"           -0.07
    " publico"      -0.07
    (blank)         -0.07
    " triun"        -0.07
    " freqü"        -0.07
    "porte"         -0.07

    Positive Logits
    "_Work"          0.09
    ".Work"          0.08
    " indruk"        0.08
    " Workspace"     0.07
    "Workspace"      0.07
    "突破"           0.07
    " ange"          0.07
    " تا"            0.07
    " вот"           0.07
    "яд"             0.07
    Activation Density: 0.001%

    No Known Activations