INDEX
    Explanations

    instances of comparisons or contrasts in the text

    Negative Logits
    Token          Logit
    " Mountain"    -0.15
    "ads"          -0.15
    "imitives"     -0.14
    "ú"            -0.14
    "anki"         -0.14
    "gi"           -0.14
    " Freder"      -0.13
    " Wings"       -0.13
    "936"          -0.13
    "ama"          -0.13

    Positive Logits
    Token          Logit
    "Helmet"       0.15
    "ê»ĺ"          0.15
    "nodoc"        0.14
    "beck"         0.14
    "客"           0.14
    "tron"         0.13
    "ullet"        0.13
    " AVR"         0.13
    "Sac"          0.13
    " cf"          0.13
    Activation Density: 0.110%

    No Known Activations