INDEX
    Explanations
    Negative Logits
    Token                 Logit
    "isy"                 -0.07
    "MOD"                 -0.07
    "quotelev"            -0.07
    "oref"                -0.07
    (no visible token)    -0.07
    "ientras"             -0.07
    "printf"              -0.06
    "ductory"             -0.06
    (no visible token)    -0.06
    "ards"                -0.06

    Positive Logits
    Token                 Logit
    " lowest"             0.07
    " highest"            0.07
    " העליון"             0.07
    " bounding"           0.07
    "上海"                0.07
    "Akt"                 0.07
    " apartment"          0.06
    "很大"                0.06
    "猪肉"                0.06
    " hometown"           0.06
    Act Density 0.012%

    No Known Activations