    Explanations

    references to different types or categories of things

    Negative Logits
    Token               Logit
    " it"               -0.64
    "?"                 -0.54
    " precondition"     -0.52
    " cardiaque"        -0.52
    "If"                -0.51
    " giu"              -0.51
    " that"             -0.51
    "нина"              -0.51
    "قر"                -0.49
    (blank)             -0.49
    Positive Logits
    Token               Logit
    " sundry"           1.26
    " various"          1.26
    "various"           1.24
    " kinds"            1.22
    " assorted"         1.18
    " nahilalakip"      1.18
    "Various"           1.18
    " Various"          1.15
    " verschiedener"    1.14
    " sorts"            1.12
    Activation Density: 0.076%

    No Known Activations