INDEX
    Explanations

    negative phrases or expressions of reluctance

    Negative Logits
    emma        -0.15
    iceps       -0.15
    arez        -0.14
    Conexion    -0.14
     Humph      -0.14
     McGr       -0.14
    NAS         -0.14
     extr       -0.13
    arga        -0.13
     shar       -0.13
    Positive Logits
    respect     0.14
    atk         0.14
    tom         0.14
    rary        0.14
    ulary       0.14
    nelle       0.14
    digest      0.13
    sexy        0.13
    vely        0.13
    encil       0.13
    Act Density 0.000%

    No Known Activations