INDEX
    Explanations

    adjectives conveying a strong and clear contrast

    the word "stark" and its variations in contexts highlighting contrasts or extremes

    Negative Logits
    Token            Logit
    "uthor"          -0.72
    "annis"          -0.71
    "uters"          -0.70
    "ipop"           -0.70
    " diligently"    -0.68
    "phis"           -0.67
    "PU"             -0.67
    " safely"        -0.66
    "hops"           -0.65
    " hemor"         -0.64
    Positive Logits
    Token            Logit
    " contrasts"     1.12
    "ly"             1.04
    " contrast"      1.03
    " stark"         0.88
    " naked"         0.82
    "est"            0.81
    "iary"           0.76
    " contrasting"   0.75
    " difference"    0.74
    "er"             0.74
    Activation Density: 0.013%

    No Known Activations