INDEX
    Explanations

    descriptions of physical attributes and characteristics

    Negative Logits
     Tun
    -0.16
     Tub
    -0.16
    unger
    -0.16
     Face
    -0.16
    AccessType
    -0.16
     Toggle
    -0.16
    面
    -0.15
     face
    -0.15
    Face
    -0.15
    顔
    -0.15
    Positive Logits
     tail
    0.98
    tail
    0.86
     Tail
    0.84
     tails
    0.81
    Tail
    0.81
    _tail
    0.70
    尾
    0.70
    .tail
    0.66
    tails
    0.66
    TAIL
    0.63
    Act Density 0.077%

    No Known Activations