INDEX
    Explanations

    the word "what" in various contexts

    Negative Logits
    Token          Logit
    " Verge"       -0.70
    " Crocodile"   -0.66
    " Aze"         -0.66
    " Moors"       -0.66
    "ztály"        -0.64
    " Jolie"       -0.64
    " Castor"      -0.64
    " суток"       -0.63
    " Swartz"      -0.62
    " BROOK"       -0.62
    Positive Logits
    Token          Logit
    " what"        1.87
    "what"         1.74
    " WHAT"        1.73
    "WHAT"         1.69
    "What"         1.64
    " What"        1.60
    " quelles"     1.02
    "whats"        0.99
    " quels"       0.94
    " wat"         0.94
    Activation Density: 0.139%

    No Known Activations