INDEX
    Explanations

    numerical comparisons relating to equivalences

    comparisons that express equivalence

    Negative Logits
    "stal"     -0.83
    "stra"     -0.75
    "bean"     -0.74
    "oard"     -0.73
    "cker"     -0.72
    "oji"      -0.70
    "beans"    -0.69
    "oufl"     -0.69
    "spe"      -0.69
    "ker"      -0.68
    Positive Logits
    "lihood"         0.90
    " oppos"         0.81
    "ivalent"        0.81
    " amounts"       0.78
    "isons"          0.77
    " sized"         0.76
    " twins"         0.76
    " equivalent"    0.75
    "MpServer"       0.75
    "oreal"          0.74
    Activation Density: 0.030%

    No Known Activations