INDEX
    Explanations

    phrases indicating comparison or evaluation of various situations or conditions

    Negative Logits
    inge      -0.14
    elix      -0.14
    acz       -0.14
    bilder    -0.13
    oples     -0.13
    образ     -0.13
    lein      -0.13
    iesel     -0.13
    iri       -0.13
    au        -0.13
    Positive Logits
    argar     0.18
    oose      0.16
    arger     0.16
    edik      0.16
    mps       0.16
    ække      0.16
    ikel      0.15
    ех        0.14
    ИТ        0.14
    odyn      0.14
    Act Density 0.019%

    No Known Activations