INDEX
    Explanations

    phrases referring to specific groups of individuals and their actions or characteristics

    phrases that typically start with "some people" indicating opinions or behaviors of individuals

    Negative Logits
    Logit   Token (quoted; a leading space is part of the token)
    -0.86   "=~"
    -0.84   "ãĤ´"
    -0.80   "enegger"
    -0.71   " è£ıç"
    -0.71   "ĸļ"
    -0.69   " Anyone"
    -0.69   "--------------------------------------------------------"
    -0.68   "-+"
    -0.67   "Anyone"
    -0.66   "————————"
    Positive Logits
    Logit   Token (quoted; a leading space is part of the token)
     0.70   "hops"
     0.70   "rooms"
     0.69   " downright"
     0.68   " creep"
     0.66   "hop"
     0.66   " wiser"
     0.64   " outright"
     0.63   " worse"
     0.63   " incorrectly"
     0.63   "lim"
    Act Density 0.310%

    No Known Activations