    Explanations

    instances where an expectation is surpassed or not met

    instances of the word "expected" and its variations

    Negative Logits
    tex        -0.73
    nan        -0.71
    neck       -0.71
    manship    -0.71
    tha        -0.70
    fighting   -0.69
    reen       -0.69
    vet        -0.69
    agra       -0.68
    below      -0.68
    Positive Logits
    ORY           0.75
    FontSize      0.74
     laughter     0.68
    ICAL          0.68
     unexpected   0.64
    IAL           0.64
    OSH           0.63
    ROR           0.62
     spont        0.61
    ICAN          0.60
    Act Density 0.031%

    No Known Activations