INDEX
    Explanations

    high-frequency function words and prepositions often found in logical or structured arguments

    Negative Logits
    Token       Logit
    "ëĿ½"       -0.16
    "imson"     -0.16
    "enchmark"  -0.15
    " Werner"   -0.15
    "791"       -0.14
    " cub"      -0.14
    "870"       -0.14
    " cont"     -0.14
    "uchs"      -0.14
    "478"       -0.14

    Positive Logits
    Token       Logit
    "ippi"      0.18
    "ipple"     0.16
    "ez"        0.16
    "arc"       0.15
    "rin"       0.15
    "uelles"    0.15
    "geb"       0.14
    "arbon"     0.14
    "oki"       0.14
    "izio"      0.14
    Activation Density: 0.001%

    No Known Activations