    Explanations

    comparisons and similarities in sentences

    Negative Logits
    Token        Logit
    SPONSORED    -0.83
    Rated        -0.78
    OPLE         -0.77
    Els          -0.76
    Ò            -0.71
    DAQ          -0.70
    bart         -0.68
    ben          -0.67
    ART          -0.67
    Ö¼           -0.67
    Positive Logits
    Token            Logit
     disclaim        0.78
     we              0.74
     dismissing      0.72
     anecdotal       0.68
     acknowledging   0.67
     these           0.64
     you             0.64
     spirits         0.64
     there           0.61
     blaming         0.61
    Activation Density: 0.160%

    No Known Activations