INDEX
    Explanations

    questions that challenge personal accountability or correctness

    Negative Logits
    "-scrollbar"     -0.13
    "á»ĥ"            -0.13
    "xAC"            -0.13
    "umph"           -0.13
    "ChangeEvent"    -0.12
    "uled"           -0.12
    "CallCheck"      -0.12
    "/inet"          -0.12
    "äºĪ"            -0.12
    "kke"            -0.12
    Positive Logits
    " questions"     0.96
    " question"      0.96
    " Questions"     0.81
    " Question"      0.77
    "questions"      0.75
    "question"       0.72
    "-question"      0.69
    " Frage"         0.69
    " QUESTION"      0.67
    " ask"           0.67
    Activation Density: 1.042%

    No Known Activations