    Explanations

    references to fur or furry-related content

    Negative Logits
    "aal"        -0.17
    "es"         -0.16
    "egasus"     -0.16
    "lifting"    -0.16
    "odia"       -0.15
    " perse"     -0.15
    "chk"        -0.15
    "emit"       -0.15
    " century"   -0.15
    " Century"   -0.15
    Positive Logits
    "thest"      0.27
    "iously"     0.23
    "ioso"       0.23
    "iosa"       0.22
    "riers"      0.22
    "rier"       0.20
    "thers"      0.19
    "phy"        0.19
    "fur"        0.18
    "uristic"    0.18
    Activation Density: 0.005%

    No Known Activations