    Negative Logits
    Token            Logit
    "tein"           -0.07
    " Beet"          -0.06
    "pink"           -0.06
    " merchants"     -0.06
    "共和"           -0.06
    " kalk"          -0.06
    " Kaw"           -0.06
    "-cn"            -0.06
    "whel"           -0.06
    " bush"          -0.06
    Positive Logits
    Token            Logit
    " hypnot"        0.07
    "rschein"        0.07
    " refusal"       0.07
    "→→"             0.07
    "ouncill"        0.06
    (not rendered)   0.06
    " combine"       0.06
    "ahrungen"       0.06
    (not rendered)   0.06
    " ویژ"           0.06
    Activation Density: 0.005%

    No Known Activations