INDEX
    Explanations

    strings of three asterisks in a row

    instances of the asterisk character and similar symbols

    Negative Logits
    Token         Logit
    "etheless"    -0.77
    " srf"        -0.74
    "uces"        -0.73
    " gaze"       -0.70
    "vation"      -0.69
    "oded"        -0.68
    " exting"     -0.66
    "odes"        -0.66
    "grass"       -0.66
    "utive"       -0.65
    Positive Logits
    Token         Logit
    "***"         0.79
    " ***"        0.78
    "edited"      0.75
    "HAEL"        0.75
    "=-=-"        0.74
    " Edited"     0.74
    "!/"          0.73
    "NEW"         0.73
    "PET"         0.71
    "TOP"         0.70
    Activation Density: 0.012%

    No Known Activations