INDEX
    Explanations

    phrases and expressions that emphasize repetition or enumeration of items

    Negative Logits
    "å®Ŀ"           -0.14
    "iards"         -0.14
    "539"           -0.14
    "dea"           -0.13
    " beloved"      -0.13
    "pt"            -0.13
    " whereabouts"  -0.13
    "necessary"     -0.13
    "ť"             -0.13
    "ovic"          -0.13

    Positive Logits
    " thing"        0.42
    " reason"       0.36
    " problem"      0.34
    " Thing"        0.32
    " funny"        0.31
    "Thing"         0.30
    " interesting"  0.29
    "thing"         0.28
    " Problem"      0.27
    " beauty"       0.27
    Activation Density: 0.367%

    No Known Activations