INDEX
    Explanations

    the word "this" followed by other words

    references to the document or post being discussed

    Negative Logits
    Ĭ±           -0.88
    mates        -0.76
    zees         -0.75
    ع            -0.71
    akuya        -0.71
    planes       -0.70
    Nazis        -0.70
    Americans    -0.69
    iors         -0.68
    ا            -0.67
    Positive Logits
     article     1.59
     blog        1.51
     tutorial    1.38
     essay       1.31
     guide       1.30
     FAQ         1.26
     post        1.26
     section     1.25
     wiki        1.21
     page        1.17
    Activation Density 0.176%

    No Known Activations