    Explanations

    occurrences of the word "number" and related numerical terms

    Negative Logits
    Token        Logit
    "eward"      -0.21
    "born"       -0.17
    "ward"       -0.16
    "utin"       -0.16
    "eniz"       -0.16
    "ัมà¸ŀ"       -0.16
    " manner"    -0.15
    "wards"      -0.15
    "avr"        -0.15
    "ovy"        -0.15
    Positive Logits
    Token        Logit
    "erable"     0.20
    "icer"       0.17
    "erb"        0.17
    "rients"     0.16
    ".gdx"       0.15
    "/address"   0.15
    "aciones"    0.15
    "exion"      0.15
    "óż"         0.15
    "velle"      0.15
    Activation Density: 0.083%

    No Known Activations