    Explanations

    Forms of "to be"

    Negative Logits
    Token               Logit
    " sitzen"           -0.08
    " bending"          -0.08
    "uchar"             -0.08
    " herb"             -0.07
    (not shown)         -0.07
    " bier"             -0.07
    " stamina"          -0.07
    " buhay"            -0.07
    " funktionieren"    -0.07
    " banyak"           -0.07

    Positive Logits
    Token               Logit
    "지를"              0.08
    " relativa"         0.08
    " noire"            0.08
    " персона"          0.08
    " Gar"              0.08
    " Cinderella"       0.08
    " происх"           0.08
    (not shown)         0.07
    " Kiss"             0.07
    "ний"               0.07

    Activation Density: 0.014%

    No Known Activations