INDEX
    Explanations

    occurrences of the word "interaction" and its variations that emphasize engagement and involvement

    Negative Logits
    (whitespace)  -0.19
    venir         -0.16
    Ñľ            -0.16
    former        -0.16
    ongs          -0.16
    ging          -0.15
    /arch         -0.15
     Byl          -0.14
    jours         -0.14
    áºŃt          -0.14
    Positive Logits
    ivate     0.25
    ives      0.25
    ively     0.24
    iveness   0.23
    ype       0.21
    uator     0.20
    al        0.20
    å¼ı       0.19
    ivity     0.19
    uality    0.19
    Act Density 0.017%

    No Known Activations