INDEX
    Explanations

    phrases that assert beliefs or clarify facts

    Negative Logits
     ſtate      -0.83
     pleaſure   -0.81
     cuffs      -0.78
     purpoſe    -0.77
     ſame       -0.76
     RSSSF      -0.76
    Portály     -0.75
     Yoh        -0.75
     raiſ       -0.75
     Weyl       -0.75
    Positive Logits
     being      1.07
     also       1.02
     is         0.98
     be         0.97
     simply     0.84
     was        0.84
     not        0.83
     going      0.82
     être       0.80
     becoming   0.78
    Activation Density: 0.321%

    No Known Activations