INDEX
    Explanations

    phrases or questions expressing uncertainty or curiosity about a situation

    Negative Logits
    " hoffe"         -0.70
    "jaus"           -0.64
    "OfYear"         -0.63
    "føl"            -0.62
    " mostrarse"     -0.61
    " niž"           -0.61
    " vuitton"       -0.58
    "Klass"          -0.58
    "ptăm"           -0.57
    " bevis"         -0.57
    Positive Logits
    "What"           0.99
    " What"          0.98
    " what"          0.96
    "what"           0.96
    "WHAT"           0.94
    " WHAT"          0.93
    "AndEndTag"      0.77
    "GEBURTSDATUM"   0.76
    "DECREF"         0.67
    " OMITBAD"       0.66
    Activation Density: 0.129%

    No Known Activations