INDEX
    Explanations

    references to specific animals or animal-related terms

    camel, giraffe, alpaca, ass, mule

    Negative Logits
    Token             Logit
    "icitis"          -0.39
    "exitRule"        -0.38
    "iyaki"           -0.37
    "discre"          -0.37
    " disclosed"      -0.36
    " iParam"         -0.36
    ":✨"              -0.36
    " disclosures"    -0.35
    "Alder"           -0.35
    " chande"         -0.35
    Positive Logits
    Token             Logit
    " camel"          0.99
    "camel"           0.93
    " ostrich"        0.89
    "🐫"              0.85
    " Camel"          0.82
    "Camel"           0.81
    " camels"         0.79
    "🐪"              0.78
    " hump"           0.73
                      0.69
    Activation Density: 0.023%

    No Known Activations