INDEX
    Explanations

    references to animal features and characteristics, particularly wings and tails

    Negative Logits
    obao        -0.18
    668         -0.16
    imers       -0.16
    /pdf        -0.16
    berapa      -0.16
    ichick      -0.16
    бав         -0.15
    iseum       -0.15
    ogue        -0.15
    svp         -0.15
    Positive Logits
    less        0.21
     n          0.16
     grace      0.15
    TI          0.15
    -equipped   0.15
    ate         0.14
    ello        0.14
    ia          0.14
    lessness    0.14
     lou        0.13
    Activation Density: 0.053%

    No Known Activations