INDEX
    Explanations
    Negative Logits
    Token                 Logit
    (no visible token)    -3.11
    (no visible token)    -3.09
    " That"               -2.25
    "G"                   -2.20
    "<bos>"               -2.08
    " ”"                  -2.08
    " is"                 -2.05
    "S"                   -2.03
    "<em>"                -1.95
    "E"                   -1.94
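    Positive Logits
    Token                 Logit
    "ية"                  2.23
    " 插画"               2.20
    (no visible token)    2.17
    " oiseau"             2.16
    (no visible token)    2.13
    "bzw"                 2.11
    (no visible token)    2.05
    "issement"            2.05
    (no visible token)    2.05
    " 内衣"               2.00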
    Activation Density: 0.010%

    No Known Activations