    NEGATIVE LOGITS
    "ן"        1.38
    "ai"       1.27
    "you"      1.24
    " you"     1.18
    "t"        1.13
    "ם"        1.05
    "ha"       1.04
    (blank)    1.02
    "a"        0.99
    "ica"      0.97
    POSITIVE LOGITS
    "EN"       1.29
    "ES"       1.26
    "WEEK"     1.23
    "OT"       1.18
    "ER"       1.16
    "AS"       1.12
    (blank)    1.12
    " week"    1.08
    "ET"       1.08
    "AN"       1.06
    Activation Density: 0.023%

    No Known Activations