INDEX
    Explanations

    negations and expressions of disagreement

    Negative Logits
    Token             Logit
    ' Efq'            -1.14
    'RenderAtEndOf'   -1.02
    ' itſelf'         -1.01
    ' ―――――'          -1.00
    ' myſelf'         -0.99
    ' poffible'       -0.95
    ' whoſe'          -0.93
    ' raiſ'           -0.92
    ' photolibrary'   -0.92
    ')";\n'           -0.88
    Positive Logits
    Token               Logit
    (blank in source)   0.70
    ' Not'              0.64
    ' via'              0.61
    ' A'                0.61
    ' not'              0.61
    '.'                 0.61
    ' as'               0.60
    '!'                 0.58
    ' I'                0.57
    '?'                 0.55
    Activation Density: 0.111%

    No Known Activations