    Explanations

    repeated use of the word "have" in various contexts

    Negative Logits
    owed             -0.15
    à¹ģล             -0.15
    ued              -0.15
    MMdd             -0.15
    loquent          -0.14
    iseum            -0.14
    ãĥ³ãĥ            -0.14
    med              -0.14
    deps             -0.14
    ufen             -0.14

    Positive Logits
     recourse         0.20
     conversations    0.20
     access           0.20
     someone          0.20
     fun              0.19
     them             0.18
     conversation     0.18
     sex              0.17
     regard           0.17
    geç               0.17
    Activation Density: 0.085%

    No Known Activations