INDEX
    Explanations

    yes/no questions and related inquiry patterns

    Negative Logits
    "hof"        -0.16
    " sonst"     -0.14
    "lette"      -0.14
    "айд"        -0.14
    "ikk"        -0.14
    "isten"      -0.13
    "pond"       -0.13
    "ensch"      -0.13
    "agt"        -0.13
    "atable"     -0.13
    Positive Logits
    " answer"    0.21
    "Answer"     0.16
    " Answer"    0.16
    "uner"       0.16
    "答"         0.15
    " answered"  0.15
    "дет"        0.15
    "ningen"     0.15
    "swer"       0.15
    "unas"       0.15
    Activation Density: 0.042%

    No Known Activations