INDEX
    Explanations

    mentions of animals and related concepts

    Negative Logits
    mund      -0.16
    огод      -0.16
    enance    -0.15
    atform    -0.15
    eday      -0.15
    indr      -0.15
    atch      -0.15
    edy       -0.14
    684       -0.14
    bies      -0.14

    Positive Logits
    /people   0.21
    istic     0.18
    arendra   0.15
    st        0.15
    ause      0.14
     kingdom  0.14
    -rights   0.14
    hud       0.13
    .fig      0.13
    üstü      0.13
    Activation Density: 0.047%

    No Known Activations