INDEX
    Explanations

    phrases indicating a sense of obligation or dependency

    Negative Logits
    tü
    -0.17
    auen
    -0.15
    veis
    -0.15
    aiser
    -0.14
    zeich
    -0.14
    .mag
    -0.14
    auf
    -0.13
    anik
    -0.13
    ści
    -0.13
    createView
    -0.13
    Positive Logits
     us
    0.23
     them
    0.20
     many
    0.18
    us
    0.17
    usat
    0.16
     him
    0.16
     everyone
    0.15
    usan
    0.14
     Spar
    0.14
     those
    0.14
    Act Density 0.149%

    No Known Activations