INDEX
    Explanations

    Negation/denial sentences

    Negative Logits
    Token            Logit
    "」的"            -0.07
    " ابتد"          -0.07
    "”的"            -0.07
    " conceded"      -0.07
    "前に"            -0.06
    "iterations"     -0.06
    " above"         -0.06
    "OLVE"           -0.06
    "')↵"            -0.06
    " ushort"        -0.06

    Positive Logits
    Token            Logit
    "utely"          0.08
    "-<?"            0.07
    "groundColor"    0.06
    " J"             0.06
    " Responsible"   0.06
    "/archive"       0.06
    "	internal"      0.06
    "unctuation"     0.06
    ",private"       0.06
    " mData"         0.06
    Act Density 0.012%

    No Known Activations