INDEX
    Explanations

    phrases indicating a sense of inclusivity or universality

    occurrences of the word "all"

    Negative Logits
    Token         Logit
    "IDS"         -0.61
    "aminer"      -0.60
    "bal"         -0.58
    "dt"          -0.58
    " Caption"    -0.56
    "inth"        -0.56
    "oute"        -0.56
    " FG"         -0.55
    "hift"        -0.55
    "ahime"       -0.54

    Positive Logits
    Token         Logit
    "ocating"     1.18
    "igator"      1.15
    "uding"       1.11
    "usion"       1.04
    "igators"     1.03
    "usions"      0.98
    "ocated"      0.97
    "udes"        0.96
    "uring"       0.93
    "ocation"     0.93

    Act Density 0.074%

    No Known Activations