    Explanations

    assertive or positive statements about products or concepts

    Negative Logits
    Token               Logit
    "itr"               -0.16
    " Reliable"         -0.15
    " unforgettable"    -0.14
    "æħİ"               -0.14
    "éré"               -0.14
    "Capability"        -0.14
    "abbo"              -0.14
    "otu"               -0.14
    " doub"             -0.14
    "usercontent"       -0.14

    Positive Logits
    Token               Logit
    " interesting"       0.30
    "interesting"        0.28
    "Interesting"        0.27
    " Interesting"       0.25
    " particularly"      0.25
    " especially"        0.25
    " interess"          0.23
    "particularly"       0.22
    " fascinating"       0.22
    "especially"         0.21

    Activation Density: 0.018%

    No Known Activations