INDEX
    Explanations

    assertions or claims of correctness or accuracy

    Negative Logits
    " Flavoring"    -0.76
    " Pastebin"     -0.75
    " scrim"        -0.74
    "gins"          -0.62
    "hens"          -0.61
    " convol"       -0.61
    "heed"          -0.61
    " Gong"         -0.60
    " Cth"          -0.59
    " Bund"         -0.59
    Positive Logits
    "headed"        0.82
    "utherford"     0.77
    "footed"        0.76
    "terday"        0.69
    "eyed"          0.69
    "Bir"           0.67
    "aez"           0.67
    " insofar"      0.67
    " Osw"          0.66
    " about"        0.64
    Activation Density: 0.049%

    No Known Activations