INDEX
    Explanations

    phrases indicating willingness for engagement or interaction

    expressions that encourage openness and communication

    Negative Logits
    rament        -0.77
     Templ        -0.64
    ources        -0.60
     Canaver      -0.59
     Clement      -0.57
     McDonnell    -0.55
    ¬¼            -0.55
    etheless      -0.54
    oaded         -0.54
    SourceFile    -0.53
    Positive Logits
    inct          0.74
    itsu          0.70
    INE           0.66
    in            0.63
    inal          0.62
    inem          0.62
    ins           0.62
    ine           0.61
    inates        0.60
    inyl          0.60
    Act Density 0.152%

    No Known Activations