INDEX
    Explanations

    turn markers and conversation

    instructions that define roles, structure, and formatting with numeric constraints or durations, especially outline-like headings, time markers, and meta-guidance within prompts.

    Negative Logits
    Token               Value
    " polytopes"        0.48
    " sows"             0.43
    " RADIOACTIVE"      0.42
    " radiographs"      0.42
    " transmutation"    0.40
    " vegetative"       0.39
    " monomials"        0.38
    " transmittance"    0.38
    " nitrification"    0.38
    "τικός"             0.38
    Positive Logits
    Token               Value
    "1"                 0.51
    "5"                 0.45
    "0"                 0.45
    "8"                 0.44
    "7"                 0.43
    "are"               0.43
    "4"                 0.43
    "6"                 0.41
                        0.40
    "2"                 0.40
    Activation Density: 0.759%

    No Known Activations