INDEX
    Explanations

    negations or contrasts in statements

    Following the word "not"

    Negative Logits
    " ویکی‌پدیا"         -0.68
    "oredCriteria"       -0.59
    "writeFieldEnd"      -0.57
    "tagHelperRunner"    -0.56
    "ArrowToggle"        -0.54
    " surla"             -0.54
    "WriteBarrier"       -0.54
    "CppMethod"          -0.54
    "SBATCH"             -0.52
    " GoogleFonts"       -0.52
    Positive Logits
    " Instead"           0.54
    " instead"           0.53
    "Instead"            0.49
    " merely"            0.47
    "ではなく"            0.44
    " замі"              0.42
    "instead"            0.42
    " only"              0.42
    " 아니라"             0.42
    " traditional"       0.40
    Act Density 0.272%
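
The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of active token positions;
    the source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 1000.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
print(f"{activation_density(sample):.3%}")  # prints 0.300%
```

Under this assumption, a density of 0.272% would correspond to the feature firing on roughly 27 of every 10,000 tokens.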

    No Known Activations