INDEX
    Explanations

    the word "How" at the beginning of sentences

    Negative Logits
    goers       -0.64
    ptions      -0.62
    ultimate    -0.62
     Feld       -0.61
    outer       -0.58
    article     -0.57
     hereafter  -0.56
    piece       -0.56
    room        -0.56
     Roller     -0.55

    Positive Logits
    soever      1.17
    ever        1.13
    ells        1.04
    beit        1.03
    ling        0.93
    itzer       0.87
    dy          0.83
    ls          0.82
     much       0.79
    leep        0.79
    Activation Density 0.063%

    No Known Activations