INDEX
    Explanations

    phrases indicating similarity or sameness

    repeated references to the concept of sameness

    Negative Logits
    Token           Logit
    "ases"          -0.80
    "*=-"           -0.76
    "åĪ"            -0.71
    "uria"          -0.67
    "gets"          -0.67
    "rosso"         -0.67
    "airs"          -0.66
    "HI"            -0.65
    "rection"       -0.65
    " Provided"     -0.65
    Positive Logits
    Token           Logit
    " thing"        0.96
    " exact"        0.86
    " ol"           0.76
    " amount"       0.71
    " ballpark"     0.71
    "iating"        0.70
    " everywhere"   0.70
    "worldly"       0.69
    " kind"         0.67
    " playbook"     0.67
    Activation Density: 0.036%

    No Known Activations