INDEX
    Explanations

    the similarity or sameness of words or concepts across different contexts

    Negative Logits
    " Provided"   -0.85
    "skirts"      -0.79
    "emi"         -0.79
    "bane"        -0.76
    "zy"          -0.73
    "acus"        -0.73
    "xtap"        -0.73
    "uckle"       -0.72
    "itely"       -0.72
    "*=-"         -0.72
    Positive Logits
    " thing"        1.03
    " exact"        1.02
    " amount"       1.01
    " kind"         0.91
    " kinds"        0.87
    " basic"        0.87
    " playbook"     0.85
    " vein"         0.83
    " principles"   0.83
    " sort"         0.82
    Act Density 0.042%

    No Known Activations