    Explanations

    references to faces and facial expressions

    Negative Logits
    Token             Logit
    "onse"            -0.18
    "cedure"          -0.17
    "овÑĸд"           -0.16
    "ards"            -0.15
    "ieder"           -0.15
    "lea"             -0.15
    "ongsTo"          -0.15
    "viso"            -0.15
    "edException"     -0.15
    "olid"            -0.15
    Positive Logits
    Token             Logit
    "(face"           0.20
    "-face"           0.18
    "/head"           0.18
    " Rooney"         0.18
    " faces"          0.18
    " ëĦ¤"            0.15
    " facial"         0.15
    "/body"           0.14
    "verbatim"        0.14
    " Faces"          0.14
    Activation Density: 0.035%

    No Known Activations