INDEX
    Explanations

    the use of first-person pronouns indicating personal involvement or experiences

    Negative Logits
    "=-=-=-=-"    -0.71
    "illac"       -0.66
    " sacrific"   -0.64
    "flix"        -0.64
    " horizont"   -0.60
    " wherein"    -0.60
    "hedon"       -0.59
    "dfx"         -0.58
    " Hayward"    -0.58
    " Camer"      -0.57

    Positive Logits
    "not"         0.76
    " suppose"    0.70
    "iking"       0.69
    "starting"    0.69
    "reporting"   0.67
    "ussian"      0.67
    "nt"          0.67
    "eling"       0.67
    "DEN"         0.66
    "still"       0.65

    Act Density 0.069%

    No Known Activations