INDEX
    Explanations

    phrases indicating the concept of reasonableness or fairness

    Negative Logits
    icket           -0.73
    chet            -0.70
    chu             -0.70
    planes          -0.69
    yi              -0.69
    frey            -0.69
    berries         -0.68
    stals           -0.65
    cart            -0.64
    flower          -0.63

    Positive Logits
    tarian           0.98
    acies            0.82
     inference       0.76
     excuse          0.74
     precaution      0.72
     expectation     0.71
    soType           0.70
    ufact            0.70
     justification   0.70
    itably           0.70
    Activation Density: 0.959%

    No Known Activations