INDEX
    Explanations

    phrases indicating the beginning or initiation of events or actions

    the word "what" in various contexts

    Negative Logits
    Token               Logit
    " fixation"         -0.69
    " fix"              -0.63
    " fixing"           -0.61
    " abiding"          -0.60
    " inserting"        -0.59
    " supporting"       -0.59
    " concentrating"    -0.58
    " running"          -0.57
    " hearing"          -0.57
    " receiving"        -0.57

    Positive Logits
    Token               Logit
    "soever"             1.08
    " happens"           1.06
    " happened"          1.04
    " transpired"        1.03
    " amounted"          1.02
    " constitutes"       0.96
    "wikipedia"          0.84
    "Downloadha"         0.83
    " appears"           0.82
    " resembles"         0.82
    Activation Density: 0.080%

    No Known Activations