INDEX
    Explanations

    phrases indicating a personal issue or problem

    the repetitive use of the word "this."

    Negative Logits
    ²¾          -0.80
    rior        -0.69
    ankind      -0.69
    isms        -0.68
    elcome      -0.68
    ãĤ·ãĥ£      -0.67
    master      -0.65
    idel        -0.65
    lee         -0.65
    ãĥĥ         -0.65
    Positive Logits
     kind        0.98
     scenario    0.94
     enthusi     0.94
     trope       0.93
     happen      0.92
     type        0.91
     sort        0.88
     tactic      0.84
     stuff       0.84
     sucker      0.83
    Act Density 0.231%

    No Known Activations