INDEX
    Explanations

    magic/fantastical powers

    Negative Logits
    NotEmpty              -0.07
    بش                    -0.07
    (no visible token)    -0.07
    Loading               -0.07
    х                     -0.06
    (no visible token)    -0.06
     sexist               -0.06
    警察                  -0.06
    (no visible token)    -0.06
     quotations           -0.06

    Positive Logits
    =\"%                  0.07
    _motor                0.07
    Carl                  0.07
    (Image                0.06
    _PIXEL                0.06
    (Name                 0.06
     名前                 0.06
    -defense              0.06
     totalmente           0.06
    (CC                   0.06
    Activation Density: 0.039%

    No Known Activations