INDEX
    Explanations
    Negative Logits
    Token        Logit
    " Joan"      -0.08
    "一组"       -0.07
    " Wu"        -0.07
    "insk"       -0.07
    "地理"       -0.07
    "ǎ"          -0.07
    "List"       -0.07
    "一字"       -0.07
    "Pair"       -0.07
    " Muse"      -0.07
    Positive Logits
    Token               Logit
    " penetration"      0.09
    " אתם"              0.08
    " penetrate"        0.08
    (token not shown)   0.07
    " registrations"    0.07
    " DOWN"             0.07
    " penetr"           0.07
    " mechanically"     0.07
    "_then"             0.07
    " directed"         0.07
    Activation Density: 0.006%

    No Known Activations