INDEX
    Explanations

    discourse about assumptions and recognition of responsibility in communication

    Negative Logits
    ibern  -0.15
    retty  -0.15
    kok  -0.14
    textfield  -0.14
    ARGS  -0.14
     времен  -0.14
    argin  -0.14
     convin  -0.13
    exampleInput  -0.13
    .oc  -0.13
    Positive Logits
     implication  0.40
     implications  0.31
     implied  0.30
     imply  0.28
     implies  0.28
     IMPLIED  0.28
     implying  0.25
     suggestion  0.23
     infer  0.22
     hint  0.22
    Act Density 0.106%

    No Known Activations