    Explanations

    asking for clarification or offers of help

    Negative Logits
    Token           Logit
    "no"            0.66
    " ignoring"     0.64
    "must"          0.64
    "enemy"         0.58
    " murderous"    0.58
    " tyranny"      0.56
    " obeyed"       0.56
    " oppressive"   0.56
    " delusion"     0.55
    " doomed"       0.54

    Positive Logits
    Token           Logit
    " Fragen"       1.08
    " Questions"    1.07
    " preguntas"    1.03
    " informacje"   1.00
    " questions"    1.00
    " inquiries"    0.99
    " 질문"          0.99
    "Questions"     0.99
    " summaries"    0.97
    " sorular"      0.96

    Activation Density: 5.554%

    No Known Activations