    Explanations

    quantifiers and explicit quantities

    Negative Logits
    Token           Logit
    "ος"            -1.08
    "}}^{("         -0.99
    "これが"           -0.95
    "mbal"          -0.94
    "blis"          -0.94
    " Substanz"     -0.93
    " Ganzen"       -0.93
                    -0.91
    "さん"            -0.91
                    -0.90
    Positive Logits
    Token           Logit
    " both"         1.13
    "Both"          1.06
    "各有"            1.04
    "都是在"           1.02
    " copious"      0.99
    "both"          0.97
    " BOTH"         0.97
    " EVERY"        0.97
    " считается"    0.97
    " TWO"          0.92
    Activation density: 0.010%

    No Known Activations