    Explanations

    occurrences of the word "more."

    Negative Logits
    Token            Logit
    "Pist"           -0.63
    "Vict"           -0.63
    " Vict"          -0.61
    "leaflet"        -0.57
    "Jacob"          -0.57
    " Jacob"         -0.56
    "paddingLeft"    -0.56
    "Cristian"       -0.56
    " Synt"          -0.56
    " Kn"            -0.56
    Positive Logits
    Token            Logit
    "more"           2.39
    "MORE"           1.92
    "More"           1.55
    "mores"          1.38
    " MORE"          1.38
    " More"          1.31
    " more"          1.17
    "emore"          1.15
    "omore"          1.05
    "更多"           1.00
    Activation Density: 0.006%

    No Known Activations