    Explanations

    phrases that indicate quantities, preferences, and personal references

    Negative Logits
    Token           Logit
    "pek"           -0.15
    "leta"          -0.14
    "bish"          -0.14
    "IFICATIONS"    -0.13
    "ipel"          -0.13
    "ffen"          -0.13
    "oner"          -0.13
    "Âį"            -0.13
    "isay"          -0.13
    "xfb"           -0.13
    Positive Logits
    Token             Logit
    " specific"       1.03
    "specific"        0.90
    " particular"     0.83
    " Specific"       0.82
    "Specific"        0.82
    " específ"        0.81
    "-specific"       0.80
    "_specific"       0.76
    " specifically"   0.73
    "pecific"         0.73
    Act Density 0.365%

    No Known Activations