INDEX
    Explanations

    the term "mean" in various contexts

    Negative Logits
    cha           -1.78
    yn            -1.69
    hed           -1.69
    ub            -1.60
    ]{.           -1.53
    htra          -1.51
    iers          -1.48
    hn            -1.48
    encies        -1.45
    bsd           -1.43
    Positive Logits
     lights        1.64
     GMT           1.58
     identical     1.53
     breath        1.48
     suit          1.46
     accompanies   1.46
     glare         1.45
     photographs   1.45
     coma          1.43
     watches       1.42
    Act Density 0.011%

    No Known Activations