    Explanations

    the word "only" in various contexts

    Negative Logits
    "只是"           -0.19
    "are"            -0.18
    "lez"            -0.16
    " ONLY"          -0.15
    "zan"            -0.15
    "-only"          -0.15
    " only"          -0.15
    "onders"         -0.15
    " seulement"     -0.15
    "ер"             -0.14
    Positive Logits
    "fans"           0.25
    "Fans"           0.24
    "큼"             0.22
    " rarely"        0.19
    " başına"        0.19
    " partially"     0.17
    "yyy"            0.16
    "yyyy"           0.16
    "ahoma"          0.16
    "ness"           0.16
    Activation Density: 0.091%

    No Known Activations