INDEX
    Explanations

    punctuation marks

    the occurrence of the word "However."

    Negative Logits
    "roy"              -0.65
    "oire"             -0.64
    "SI"               -0.63
    "into"             -0.63
    "ULAR"             -0.61
    "gall"             -0.61
    "blank"            -0.60
    "TP"               -0.59
    "AZ"               -0.58
    "SourceFile"       -0.57

    Positive Logits
    " alas"             1.05
    "chery"             0.96
    " unlike"           0.91
    " according"        0.88
    " beware"           0.85
    " interestingly"    0.84
    " nevertheless"     0.81
    " fortunately"      0.81
    " owing"            0.79
    " despite"          0.79
    Activation Density: 0.099%

    No Known Activations