INDEX
    Explanations

    repeated phrases or patterns in the text

    NEGATIVE LOGITS
     pim           -0.64
     pound         -0.60
     envy          -0.58
     arming        -0.57
     corridors     -0.57
     latch         -0.56
    Spoiler        -0.56
     herald        -0.55
     ribbon        -0.54
     establishing  -0.54
    POSITIVE LOGITS
    enei    0.99
    oslav   0.91
    pillar  0.88
    haar    0.82
    iq      0.82
    heid    0.79
    hiro    0.79
    heng    0.79
    arel    0.79
    bah     0.77
    Act Density 3.258%

    No Known Activations