INDEX
    Explanations

    "one" followed by common words

    Negative Logits
    Token       Value
                1.41
    u           1.27
    er          1.26
    ang         1.25
    al          1.23
    in          1.20
    p           1.17
    era         1.14
    bred        1.14
    i           1.13

    Positive Logits
    Token       Value
    unsere      1.19
    ING         1.17
    Cuando      1.15
    D           1.15
    G           1.14
    Entonces    1.13
    ك           1.13
    Ich         1.12
                1.12
    Y           1.11

    Activation Density: 0.015%

    No Known Activations