INDEX
    Explanations

    references to primates and their characteristics

    Negative Logits
    Token        Logit
    "illes"      -0.18
    "鱼"         -0.16
    "草"         -0.16
    "sword"      -0.16
    "fir"        -0.16
    "omes"       -0.15
    "lems"       -0.15
    "lies"       -0.15
    "itel"       -0.14
    "bau"        -0.14
    Positive Logits
    Token        Logit
    " monkey"    0.23
    " monkeys"   0.23
    "Monkey"     0.22
    " Monkey"    0.21
    "monkey"     0.21
    " tree"      0.20
    " ape"       0.19
    "-human"     0.19
    " Tree"      0.17
    "arch"       0.16
    Act Density 0.052%

    No Known Activations