    Explanations

    references to different types of animals

    Negative Logits
    Token           Logit
    "minster"       -0.77
    "lder"          -0.70
    "nance"         -0.68
    "heit"          -0.66
    "âĸ¬"           -0.65
    " Bauer"        -0.65
    " Sutherland"   -0.64
    "nee"           -0.63
    "nant"          -0.63
    "nder"          -0.62
    Positive Logits
    Token           Logit
    " animals"      1.35
    " Animals"      1.16
    " mammals"      1.10
    " animal"       1.03
    " carc"         1.03
    "animal"        1.02
    " apes"         0.98
    " reptiles"     0.95
    "brates"        0.93
    " primates"     0.93
    Activation Density: 0.014%

    No Known Activations