    Explanations

    phrases associated with surprising or unexpected outcomes

    Negative Logits
    "tagHelperRunner"  -0.61
    "AddTagHelper"  -0.57
    "قایناق‌لار"  -0.53
    " autorytatywna"  -0.50
    " متعلقه"  -0.48
    "hoeddwyd"  -0.47
    "Geograf"  -0.45
    " hro"  -0.45
    " Linton"  -0.45
    "hdysval"  -0.45

    Positive Logits
    " surprise"  1.39
    "surprise"  1.27
    " surprised"  1.26
    " surprises"  1.25
    " unsur"  1.24
    " Surprise"  1.24
    "Surprise"  1.19
    " surpresa"  1.16
    "Surprised"  1.15
    " sorpresa"  1.14
    Activation Density: 0.229%
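The density figure above is commonly read as the fraction of sampled tokens on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for illustration only.
sample = [0.0, 0.0, 0.5, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.229% would mean the feature is active on roughly 2 or 3 tokens out of every 1,000 sampled.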

    No Known Activations