INDEX
    Explanations

    phrases that introduce or emphasize a point of emphasis or contrast

    dash patterns or interruptions in text formatting

    Negative Logits
    Token            Logit
    " protective"    -0.70
    "onut"           -0.68
    " baking"        -0.68
    "emy"            -0.67
    " grave"         -0.67
    "obar"           -0.67
    " discern"       -0.65
    " heart"         -0.65
    " iceberg"       -0.64
    " drain"         -0.64
    Positive Logits
    Token            Logit
    "lance"          0.92
    "fuck"           0.90
    "[["             0.90
    ")--"            0.85
    "DOWN"           0.85
    "FORE"           0.84
    "NOW"            0.83
    "=="             0.83
    "SOURCE"         0.83
    "_-"             0.82
    Act Density 0.012%

    No Known Activations