INDEX
    Explanations

    past participles and past tense verbs

    Negative Logits
    ласти
    -0.16
    識
    -0.15
    čet
    -0.14
    inyin
    -0.14
    zelf
    -0.14
    ward
    -0.14
    ประมาณ
    -0.14
    _procs
    -0.14
    ing
    -0.14
    bild
    -0.13
    Positive Logits
     up
    0.30
     away
    0.29
     out
    0.28
     down
    0.28
     into
    0.27
    -up
    0.27
    -down
    0.25
     off
    0.25
    -out
    0.24
    -off
    0.22
    Act Density 0.027%

    No Known Activations