INDEX
    Explanations

    expressions of requests and communication

    Negative Logits
    " supposedly"    -0.17
    " hence"         -0.16
    " deemed"        -0.16
    " albeit"        -0.16
    "orough"         -0.15
    " allegedly"     -0.14
    ".vo"            -0.14
    "Upon"           -0.14
    " Hence"         -0.14
    " Throughout"    -0.14
    Positive Logits
    " begin"         0.21
    " near"          0.19
    " nearly"        0.18
    " begins"        0.17
    " bec"           0.17
    " enjo"          0.17
    " began"         0.16
    " comport"       0.16
    " beginning"     0.16
    "near"           0.16
    Activation Density: 0.079%

    No Known Activations