INDEX
    Explanations

    phrases indicating significant involvement or contribution to various contexts

    Negative Logits
    Token        Logit
    "Exact"      -0.14
    "inel"       -0.14
    "ritten"     -0.14
    "lesen"      -0.14
    " Bilder"    -0.14
    ".hr"        -0.14
    "rap"        -0.13
    " tinder"    -0.13
    "ril"        -0.13
    "immers"     -0.13
    Positive Logits
    Token        Logit
    " shaping"   0.16
    "opak"       0.15
    "kaar"       0.15
    "extr"       0.15
    "omu"        0.14
    "owan"       0.14
    "ommen"      0.14
    "ubah"       0.14
    "Verb"       0.14
    "emark"      0.14
    Act Density 0.145%

    No Known Activations