INDEX
    Explanations

    instances of the word "you" and its various forms

    Negative Logits
    Token       Logit
    "WER"       -0.17
    "ocht"      -0.15
    "æĦı"       -0.15
    "nees"      -0.15
    "ÑĤим"      -0.14
    "ISTR"      -0.14
    " Alive"    -0.13
    "RIPT"      -0.13
    "orry"      -0.13
    "-cols"     -0.13

    Positive Logits
    Token       Logit
    " Econ"     0.17
    "/e"        0.14
    "èĻ"        0.14
    "emiz"      0.14
    "ured"      0.14
    "arily"     0.14
    "vit"       0.14
    "Ļ"         0.14
    "оÑĢов"     0.14
    "arrera"    0.14

    Activation Density: 0.035%

    No Known Activations