INDEX
    Explanations

    short phrases starting with specific keywords

    repetition of the same character or symbol, particularly the empty token

    Negative Logits
    "xxxx"          -0.74
    "è£"            -0.70
    " respons"      -0.70
    " thereafter"   -0.68
    "XXXX"          -0.66
    " thereof"      -0.64
    " disg"         -0.63
    " thereto"      -0.63
    " compe"        -0.61
    " encour"       -0.60
    Positive Logits
    " Expand"       0.81
    "zbollah"       0.80
    " Answer"       0.80
    " SHARES"       0.74
    " Updated"      0.73
    " Facts"        0.72
    " Vegan"        0.72
    " Recipe"       0.69
    "resa"          0.68
    "Wiki"          0.67
    Act Density 0.242%

    No Known Activations