INDEX
    Explanations

    phrases indicating feelings, capabilities, and comparisons about quality and improvement

    Negative Logits
     larger      -0.23
     longer      -0.21
     Longer      -0.20
     narrower    -0.20
     Larger      -0.20
     heavier     -0.20
     bigger      -0.19
     harder      -0.18
     smaller     -0.17
     wider       -0.17
    Positive Logits
     bet          0.40
     BET          0.38
     infinitely   0.31
     better       0.30
     Bett         0.29
     bets         0.28
     batter       0.28
     MUCH         0.28
     WAY          0.27
     Much         0.27
    Act Density 0.160%

    No Known Activations