INDEX
    Explanations

    phrases related to comparisons or contrasts

    Negative Logits
    Token          Logit
    "dylib"        -0.78
    " Coffin"      -0.68
    " Femin"       -0.66
    " Alban"       -0.65
    "berra"        -0.64
    "chieve"       -0.63
    "achev"        -0.63
    " Starr"       -0.62
    " Narc"        -0.60
    " Dempsey"     -0.60

    Positive Logits
    Token                   Logit
    " shouldn"              0.84
    "initely"               0.83
    " naturally"            0.81
    " definitely"           0.79
    " hopefully"            0.78
    " kinda"                0.77
    " chances"              0.77
    " externalToEVAOnly"    0.73
    " accordingly"          0.72
    " yeah"                 0.71

    Activation Density: 0.588%

    No Known Activations