INDEX
    Explanations

    words related to past actions or completed tasks

    phrases related to academic or formal communication

    Negative Logits
    eatures         -0.58
    ghai            -0.57
    quartered       -0.53
     Sundays        -0.52
    edIn            -0.50
     hospitality    -0.49
    rely            -0.49
     pilgr          -0.49
    ofi             -0.48
     Saturdays      -0.48
    Positive Logits
    /,              0.75
     (!             0.73
    ;               0.72
    .               0.68
     haha           0.67
     lol            0.67
     :)             0.66
    !:              0.65
     ("             0.63
     etc            0.63
    Activation Density: 1.186%

    No Known Activations