INDEX
    Explanations

    comparative language and indicators of difference

    Negative Logits
    Token                  Logit
    " I"                   -0.99
    (token not rendered)   -0.95
    " you"                 -0.94
    " guys"                -0.79
    " is"                  -0.76
    "/"                    -0.76
    "."                    -0.75
    (whitespace)           -0.75
    " it"                  -0.74
    " we"                  -0.71
    Positive Logits
    Token          Logit
    " ―――――"       1.23
    " ſind"        1.23
    " ་་"          1.23
    " itſelf"      1.09
    " auffi"       1.06
    " doubtnut"    1.05
    " eſſ"         1.05
    "ſelf"         1.05
    " quæ"         1.04
    " iſt"         1.03
    Activation Density: 7.663%

    No Known Activations