    Explanations

    instances where the text discusses the possibility or lack thereof of a certain condition or event

    words indicating uncertainty or conditions regarding existence and actions

    Negative Logits
    Token       Logit
    "],         -0.71
    ]),         -0.68
    "),         -0.67
    ]).         -0.65
    "},         -0.65
    "))         -0.64
    estern      -0.63
    anwhile     -0.60
    icators     -0.60
    ',"         -0.60

    Positive Logits
    Token       Logit
    ,           0.83
    *,          0.75
    ,,          0.70
    ,.          0.69
    .,          0.65
    âĢķ         0.58
    DERR        0.57
    !,          0.57
    gall        0.57
    .*          0.54
    Act Density 0.885%

    No Known Activations