    Explanations

    sentences that provide a conclusion or emphasize a point

    Negative Logits
    stery           -0.17
    ayne            -0.14
     (“             -0.14
    onn             -0.14
    umble           -0.14
    ops             -0.14
    shed            -0.14
    fp              -0.13
     fur            -0.13
    تÙĪÙĨ           -0.13
    Positive Logits
     "              0.20
    ")(             0.17
     "↵             0.17
     "(             0.15
     ";             0.15
    .SizeType       0.15
     "$             0.14
    omo             0.14
    icker           0.14
    eli             0.14
    Activation Density: 0.032%

    No Known Activations