    Explanations

    direct statements and acknowledgements

    Negative Logits
    Token      Value
    "!!"       1.01
               0.97
    "!!!!!"    0.91
    "!!!"      0.91
    "!..."     0.88
    "!"        0.86
    "!!!!"     0.86
    " !"       0.84
    "等等"     0.83
    " !!"      0.82
    Positive Logits
    Token            Value
    " sobering"      0.98
    "oterapia"       0.98
    " bienvenida"    0.96
    " regrett"       0.95
    "مني"            0.95
    " welcome"       0.94
    " preferable"    0.94
                     0.91
    " hardly"        0.90
    " pretty"        0.88
    Activation Density: 0.057%

    No Known Activations