INDEX
    Explanations

    phrases mentioning places or locations

    references to the word "the."

    Negative Logits
    Token               Logit
    " distinguishes"    -0.67
    "packages"          -0.63
    " compared"         -0.63
    " alike"            -0.62
    "uci"               -0.62
    "iating"            -0.62
    "iac"               -0.61
    "/-"                -0.61
    "pelling"           -0.59
    "thood"             -0.58
    Positive Logits
    Token               Logit
    " forefront"        1.15
    " depths"           1.07
    " fray"             0.98
    " sidelines"        0.98
    " periphery"        0.97
    " nearest"          0.96
    " podium"           0.96
    " fullest"          0.93
    " outskirts"        0.89
    " same"             0.88
    Activation Density: 0.184%

    No Known Activations