    Explanations

    attends to the closing double slashes denoting comments from corresponding opening tokens

    Head Attr Weights
    0:0.02
    1:0.03
    2:0.01
    3:0.09
    4:0.04
    5:0.02
    6:0.07
    7:0.68
    Negative Logits
    -0.72
     [
    -0.60
     I
    -0.57
     i
    -0.56
     "
    -0.54
     (
    -0.53
    /
    -0.52
     or
    -0.50
     C
    -0.50
     W
    -0.50
    Positive Logits
     itſelf        1.45
     myſelf        1.39
     Efq           1.27
     Theſe         1.27
     themſelves    1.27
     Monfieur      1.25
     Houſe         1.23
     pleaſure      1.23
    ſelf           1.21
     himſelf       1.21
    Act Density 0.041%

    No Known Activations