INDEX
    Explanations

    phrases indicating contrast or comparison

    phrases indicating distance or separation from a condition or state

    Negative Logits
    andowski        -0.72
    tex             -0.70
    liction         -0.70
     contrace       -0.69
    çī              -0.69
    ilage           -0.69
    rongh           -0.68
     princ          -0.68
    FY              -0.68
     Defin          -0.67
    Positive Logits
     being          0.81
     anywhere       0.76
     ideal          0.75
     conclusive     0.74
     anything       0.72
     satisfactory   0.69
     perfect        0.66
     exhaustive     0.66
     resembling     0.65
     optimal        0.64
    Activation Density: 0.036%

    No Known Activations