INDEX
    Explanations

    mentions of fur and related terms

    Negative Logits
    "chk"         -0.17
    "es"          -0.16
    " century"    -0.16
    "aal"         -0.16
    "listed"      -0.16
    "ez"          -0.16
    "lifting"     -0.15
    "egasus"      -0.15
    "y"           -0.15
    " Century"    -0.15
    Positive Logits
    "thest"       0.26
    "ioso"        0.23
    "iously"      0.23
    "iosa"        0.22
    "thers"       0.21
    "rier"        0.21
    "riers"       0.20
    "fur"         0.18
    "phy"         0.18
    " fur"        0.18
    Activation Density 0.006%

    No Known Activations