INDEX
    Explanations

    phrases indicating emphasis or focus on particular subjects or ideas

    Negative Logits
    "awtextra"            -0.60
    "antren"              -0.60
    "interopRequire"      -0.57
    "رشف"                 -0.55
    " religieuses"        -0.52
    "big"                 -0.51
    "出版年"              -0.51
    "rachtet"             -0.50
    " preocupes"          -0.48
    " insuffisamment"     -0.47
    Positive Logits
    " very"               2.08
    "very"                1.80
    "Very"                1.72
    " Very"               1.68
    "VERY"                1.43
    " VERY"               1.38
    " próprio"            1.32
    " própria"            1.30
    " stesso"             1.22
    " stessa"             1.21
    Activation Density: 0.277%

    No Known Activations