INDEX
    Explanations

    phrases indicating a small amount or degree

    mentions of the word "little" in varying contexts

    Negative Logits
    eneg       -0.91
    itivity    -0.77
    ocity      -0.77
    ources     -0.74
    idents     -0.73
    itures     -0.73
    hester     -0.73
    itars      -0.70
    anwhile    -0.70
    ovies      -0.70
    Positive Logits
     bit        1.35
     peek       0.86
     helper     0.81
     glimpse    0.80
     tad        0.78
     BIT        0.77
     girl       0.75
     harmless   0.74
     hitter     0.72
     patience   0.71
    Activation Density: 0.023%

    No Known Activations