INDEX
    Explanations

    comparative phrases or expressions indicating similarity or likeness

    NEGATIVE LOGITS
    Token          Logit
    "icators"      -0.86
    "ocamp"        -0.81
    "icator"       -0.81
    "iencies"      -0.76
    "ourse"        -0.76
    "ixel"         -0.76
    "ribution"     -0.74
    "Published"    -0.73
    "rity"         -0.73
    "ilic"         -0.73
    POSITIVE LOGITS
    Token             Logit
    "liest"           1.11
    "lier"            1.05
    "lihood"          0.90
    " comparing"      0.77
    " waking"         0.75
    " spitting"       0.74
    " heaven"         0.72
    " crazy"          0.70
    " forgetting"     0.68
    " remembering"    0.67
    Activation Density: 0.023%

    No Known Activations