INDEX
    Explanations

    instances where the text mentions equality or similarity

    phrases emphasizing equality and uniformity across different subjects or contexts

    Negative Logits
    姫
    -0.86
     Provided
    -0.80
    alf
    -0.78
    aze
    -0.77
    skirts
    -0.74
    *=-
    -0.73
    anism
    -0.73
    zona
    -0.73
    better
    -0.73
    ason
    -0.72
    Positive Logits
     thing
    1.18
     amount
    1.13
     kinds
    1.10
     kind
    1.08
     exact
    1.05
     sorts
    1.01
     sort
    0.99
     fate
    0.93
     basic
    0.92
     principles
    0.87
    Act Density 0.052%

    No Known Activations