INDEX
    Explanations

    questions regarding beliefs and answers related to knowledge or understanding

    NEGATIVE LOGITS
    "ftagPool"           -0.59
    " للمعارف"           -0.53
    "EDEFAULT"           -0.51
    "ConstraintMaker"    -0.50
    "ITHUB"              -0.49
    " nahilalakip"       -0.49
    "ніципалі"           -0.49
    " muualla"           -0.48
    "layoutControl"      -0.48
    " tartalomajánló"    -0.47
    POSITIVE LOGITS
    " answer"       3.75
    " answers"      3.39
    " answered"     3.13
    "answer"        3.11
    " Answer"       3.00
    " answering"    2.88
    "Answer"        2.81
    " ANSWER"       2.81
    " réponse"      2.64
    " Answers"      2.63
    Act Density 0.777%

    No Known Activations