INDEX
    Explanations

    phrases that indicate generalizations or normative statements

    words related to general statements or introductions, typically starting with "Generally" or "Initially"

    Negative Logits
    ggles               -0.79
    aily                -0.77
    kamp                -0.74
    kefeller            -0.71
    ocaust              -0.66
    addons              -0.65
    umbn                -0.65
    "},"                -0.63
    natureconservancy   -0.63
    uras                -0.62
    Positive Logits
     speaking           1.16
    adays               0.90
    ,                   0.88
     there              0.87
     we                 0.80
     they               0.78
     it                 0.77
    ,.                  0.75
     Speaking           0.74
    speaking            0.70
    Act Density 0.173%

    No Known Activations