INDEX
    Explanations

    phrases related to communication and giving instructions

    instances of direct speech or reported dialogue

    Negative Logits
    "aghd"              -0.72
    "oshenko"           -0.71
    "idates"            -0.69
    "ses"               -0.61
    " exploits"         -0.61
    " controversies"    -0.60
    "obos"              -0.57
    "ãĥ³"               -0.57
    " operates"         -0.56
    "¥µ"                -0.56
    Positive Logits
    " myself"           1.96
    " my"               1.50
    " mine"             1.03
    " MY"               0.88
    "yss"               0.85
    " My"               0.77
    " ourselves"        0.75
    "oan"               0.73
    " him"              0.73
    "my"                0.72
    Act Density 0.446%

    No Known Activations