INDEX
    Explanations

    variations of the word "you" and its contextual references

    Negative Logits
    Token         Logit
    "µ"           -0.16
    "oir"         -0.15
    "on"          -0.15
    "antly"       -0.14
    "ader"        -0.14
    " Besch"      -0.14
    " gratuit"    -0.13
    "frei"        -0.13
    "itory"       -0.13
    " autop"      -0.13
    Positive Logits
    Token         Logit
    "dsn"         0.17
    "čení"        0.16
    "ulla"        0.15
    "uali"        0.15
    "울"          0.15
    "olan"        0.15
    "acity"       0.15
    "arter"       0.14
    "lete"        0.14
    "ELLOW"       0.14
    Activation Density: 0.026%

    No Known Activations