INDEX
    Explanations

    negative statements or contradictions

    negative phrases and rejections of concepts or narratives

    Negative Logits
    itas         -0.76
    kees         -0.74
    Fs           -0.73
    unders       -0.68
    units        -0.67
    })           -0.67
    ãĤ¼ãĤ¦ãĤ¹    -0.66
    WAY          -0.66
    çļ           -0.65
    anders       -0.65
    Positive Logits
     necessarily   1.29
     merely        1.08
     mere          0.93
    withstanding   0.91
     meant         0.84
     flashy        0.83
     uncommon      0.79
     solely        0.79
     rocket        0.79
     kidding       0.77
    Activation Density: 0.170%

    No Known Activations