    Explanations

    subscripts and superscripts

    Negative Logits
    Token          Logit
    (blank)        -0.08
    " mide"        -0.08
    " postura"     -0.08
    " rasa"        -0.08
    "reate"        -0.07
    "waiting"      -0.07
    " waiting"     -0.07
    " først"       -0.07
    "dde"          -0.07
    " wall"        -0.07
    Positive Logits
    Token          Logit
    " brincar"     0.10
    " Forschung"   0.08
    " Swi"         0.08
    " Amo"         0.08
    ",omitempty"   0.08
    "ilhe"         0.08
    (blank)        0.08
    "ils"          0.07
    " Belo"        0.07
    " bermain"     0.07
    Act Density 0.003%

    No Known Activations