INDEX
    Explanations
    Negative Logits
    panjang         -1.16
     advantageous   -1.15
     Đông           -1.13
     kveld          -1.05
     besø           -1.05
     But            -1.02
    Kön             -1.02
    ēju             -1.02
     Why            -1.00
     jogar          -1.00
    Positive Logits
    often           1.32
     sempre         1.18
     FUCKING        1.14
    including       1.10
    ating           1.10
     nuovo          1.09
    usually         1.08
    still           1.07
    IA              1.05
     asla           1.04
    Act Density 0.088%

    No Known Activations