INDEX
    Explanations

    Code/Informal writing

    requests for poems and the assistant’s introductory phrasing when presenting a poem

    Negative Logits
     vài
    -0.08
    不管是
    -0.08
    قسام
    -0.08
    不會
    -0.08
     Dict
    -0.07
    plash
    -0.07
    heets
    -0.07
     mọi
    -0.07
    Most
    -0.07
    -0.07
    Positive Logits
     מלא
    0.08
    ĥ
    0.07
     SEL
    0.07
    0.06
     그것
    0.06
     MyBase
    0.06
    UserService
    0.06
    _passed
    0.06
    0.06
     vessels
    0.06
    Activation Density 0.139%

    No Known Activations