    Explanations

    the letter "t" appearing at the end of words

    negative contractions of the phrase "do not" in various contexts

    Negative Logits
    Token           Logit
    " behavi"       -0.76
    " Penguin"      -0.69
    " Radiation"    -0.67
    "Reviewer"      -0.66
    " Leopard"      -0.65
    "ses"           -0.61
    " Neigh"        -0.60
    " disparate"    -0.60
    " Sharp"        -0.59
    " Scarlet"      -0.59
    Positive Logits
    Token                         Logit
    "cha"                         1.09
    "ople"                        0.92
    "otally"                      0.89
    "raq"                         0.88
    "unes"                        0.87
    "athom"                       0.86
    "ude"                         0.84
    "âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ"    0.83
    " myself"                     0.81
    "plet"                        0.78
    Activation Density: 0.094%

    No Known Activations