    Explanations

    instances of the word "what," indicating a focus on questions or inquiries

    Negative Logits
    Token           Logit
    " Aze"          -0.76
    " brook"        -0.74
    " Verge"        -0.72
    " castor"       -0.71
    " Moors"        -0.70
    "Brooks"        -0.68
    " Lucerne"      -0.67
    " Geor"         -0.66
    " BROOK"        -0.66
    "timbangkan"    -0.66
    Positive Logits
    Token           Logit
    " what"         1.81
    " WHAT"         1.70
    "what"          1.67
    "WHAT"          1.65
    " What"         1.65
    "What"          1.65
    " wat"          0.96
    "whats"         0.95
    "Τι"            0.94
    " quelles"      0.93
    Activation Density: 0.088%

    No Known Activations