INDEX
    Explanations

    variations of the word "accept" and its related forms

    Negative Logits
    "alnız"    -0.17
    "inst"     -0.15
    "exact"    -0.15
    "oley"     -0.15
    "cape"     -0.15
    "र"        -0.15
    "dling"    -0.14
    "dy"       -0.14
    "fur"      -0.14
    "оля"      -0.14
    Positive Logits
    "ance"               0.34
    "ably"               0.32
    " responsibility"    0.26
    "ances"              0.26
    "ANCE"               0.25
    "ively"              0.21
    "reject"             0.20
    "able"               0.20
    "eer"                0.19
    " участие"           0.19
    Activation Density 0.046%

    No Known Activations