INDEX
    Explanations

    phrases indicating communication or instructions being given

    Negative Logits
    Token               Logit
    "oshenko"           -0.71
    "idates"            -0.61
    " operates"         -0.60
    "egal"              -0.58
    "ses"               -0.57
    " nowadays"         -0.56
    " notoriously"      -0.55
    "ouls"              -0.54
    " understandably"   -0.53
    " occupies"         -0.52
    Positive Logits
    Token               Logit
    " myself"           1.59
    " my"               1.19
    " him"              1.02
    " ourselves"        0.98
    " them"             0.89
    "ucc"               0.80
    " mine"             0.78
    " THEM"             0.77
    "igree"             0.69
    "yss"               0.69
    Activation Density: 0.534%

    No Known Activations