INDEX
    Explanations

    clarity or emphasis in statements

    phrases indicating clarity or making something clear

    Negative Logits
    Token       Logit
    "inse"      -0.74
    "umbn"      -0.68
    "umat"      -0.66
    "asus"      -0.65
    "ernels"    -0.65
    "unte"      -0.64
    "chance"    -0.64
    "hovah"     -0.63
    " Rou"      -0.63
    "olen"      -0.63
    Positive Logits
    Token             Logit
    "ances"           0.91
    " distinctions"   0.89
    " outlines"       0.83
    "iary"            0.81
    " distinction"    0.79
    "forth"           0.74
    " sailing"        0.74
    " clear"          0.71
    " contrasts"      0.68
    " clearer"        0.68
    Activation Density 0.019%

    No Known Activations