INDEX
    Explanations

    phrases related to direct comparisons or competitions

    references to physical confrontations or interactions

    Negative Logits
     Polk             -0.69
     tremend          -0.68
    otten             -0.67
     Roose            -0.66
    iom               -0.66
    live              -0.65
    anson             -0.65
    gnu               -0.64
    oras              -0.63
    orks              -0.62
    Positive Logits
     GHz              0.69
     conversations    0.69
     bilingual        0.68
     transsexual      0.66
     interactions     0.66
     ratio            0.65
    ALSE              0.64
     comparisons      0.63
     Indonesian       0.63
    ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯  0.63
    Activation Density: 0.045%

    No Known Activations