INDEX
    Explanations

    mentions of things being the same or identical

    references to the concept of matching

    Negative Logits
    "Ïī"                                -0.70
    "OLOG"                              -0.70
    "********************************"  -0.69
    "ase"                               -0.67
    "Daily"                             -0.66
    "################################"  -0.65
    "ember"                             -0.65
    "aug"                               -0.65
    "gom"                               -0.64
    "bra"                               -0.64
    Positive Logits
    " matching"   1.10
    " matched"    1.05
    " matches"    0.95
    "ees"         0.81
    " match"      0.80
    "inances"     0.75
    " mism"       0.74
    "paren"       0.73
    "poons"       0.71
    " pairs"      0.71
    Activation Density: 0.010%

    No Known Activations