INDEX
    Explanations

    percentage comparisons in text

    phrases indicating comparisons or relationships between different items or groups

    Negative Logits
    Token           Logit
    "gravity"       -0.79
    "rog"           -0.77
    "rak"           -0.72
    "ger"           -0.71
    " Defenders"    -0.70
    "azaar"         -0.69
    "spe"           -0.67
    "talk"          -0.67
    "gery"          -0.66
    "wordpress"     -0.66
    Positive Logits
    Token           Logit
    " eleph"        0.83
    " sexes"        0.77
    " sidx"         0.77
    " guiActiveUn"  0.77
    " weep"         0.74
    " proport"      0.73
    " nomine"       0.71
    " halves"       0.69
    "nces"          0.69
    "çͰ"            0.67
    Activation Density: 0.009%

    No Known Activations