INDEX
    Explanations
    Negative Logits
     produziert
    -1.31
     his
    -1.26
    就可以
    -1.23
    s
    -1.23
     hecho
    -1.23
     realizado
    -1.20
    olkien
    -1.18
    son
    -1.17
    …"
    -1.16
    brtc
    -1.15
    Positive Logits
    Gobierno
    1.46
    Día
    1.36
    muñeca
    1.26
    ilustración
    1.23
    1.21
    เกง
    1.20
     gorro
    1.16
    1.12
    ǔ
    1.12
    ınıza
    1.12
    Act Density 0.095%

    No Known Activations