INDEX
    Explanations

    German words indicating actions and descriptions, particularly those related to human behavior and circumstances

    Negative Logits
    ettel
    -0.20
    uite
    -0.16
     Richardson
    -0.15
    azer
    -0.15
    abus
    -0.15
    езд
    -0.14
     jd
    -0.14
    icher
    -0.14
    lass
    -0.14
    dna
    -0.14
    Positive Logits
    iert
    0.26
    agt
    0.25
    elt
    0.25
    gt
    0.25
    ает
    0.24
    igt
    0.24
    ibt
    0.23
    рует
    0.23
    ує
    0.23
    ывает
    0.23
    Act Density 0.034%

    No Known Activations