INDEX
    Explanations
    NEGATIVE LOGITS
    " mathbf"     0.70
    "zelfde"      0.68
    "ą"           0.68
    "zens"        0.68
    " Fier"       0.66
    " Ciebie"     0.66
    " دے"         0.65
    " Passing"    0.65
    "াধিক"        0.64
    " T"          0.64
    POSITIVE LOGITS
    " mínima"         0.84
    " determinada"    0.79
    " limitada"       0.78
    "rentes"          0.77
    " ligados"        0.76
    "为了"            0.75
    " için"           0.75
    " demanda"        0.75
    " errado"         0.75
    " flips"          0.74
    Activation Density: 0.017%

    No Known Activations