INDEX
    Explanations

    phrases related to asking questions or seeking information

    references to the concept of "what" in various contexts

    Negative Logits
    Token            Logit
    "enburg"         -0.84
    "robe"           -0.73
    "gur"            -0.73
    "gi"             -0.68
    "xon"            -0.67
    "kj"             -0.66
    "odge"           -0.60
    "uge"            -0.60
    "arella"         -0.60
    " favor"         -0.58

    Positive Logits
    Token            Logit
    " transpired"     1.39
    " happens"        1.34
    " happened"       1.30
    " constitutes"    1.21
    "soever"          1.09
    " awaits"         0.99
    " separates"      0.97
    " happ"           0.97
    " exactly"        0.96
    " constituted"    0.94

    Activation Density: 0.107%

    No Known Activations