INDEX
    Explanations

    references to faces or facial features

    Negative Logits
    баÑĩ            -0.18
    edException    -0.17
    ilos           -0.17
    esus           -0.16
    readcr         -0.16
    sWith          -0.16
    ling           -0.16
    enga           -0.15
    self           -0.15
    láš            -0.15
    Positive Logits
    plate          0.28
    /head          0.25
    plates         0.22
    /body          0.21
    less           0.21
    (book          0.20
    idon           0.18
     mask          0.18
    cloth          0.18
    piece          0.17
    Activation Density: 0.025%

    No Known Activations