    Explanations

    interview transcripts

    chat-format metadata that marks the assistant’s turn, especially the closing assistant header delimiter.

    Negative Logits
    INGS         -0.08
    OTOR         -0.07
    ienza        -0.07
    hi           -0.07
    (client      -0.07
     sw          -0.07
    coins        -0.06
    .FILE        -0.06
    [blank]      -0.06
    üt           -0.06
    Positive Logits
     [];↵        0.07
     Retrieved   0.06
     Inside      0.06
    retweeted    0.06
     الذين       0.06
    .Resize      0.06
    :@"%         0.06
    minent       0.06
     derec       0.06
     LOGGER      0.06
    Activation Density: 0.145%

    No Known Activations