    NEGATIVE LOGITS
    Token       Logit
    "',{'"      -0.11
    " "         -0.09
    "orce"      -0.09
    "ohan"      -0.09
    "abee"      -0.09
    "į"         -0.09
    "ungi"      -0.09
    ":"         -0.08
    "older"     -0.08
    ".src"      -0.08
    POSITIVE LOGITS
    Token           Logit
    " whom"          0.33
    " mentioned"     0.25
    "mentioned"      0.23
    " menc"          0.21
    " youre"         0.20
    " referred"      0.20
    " مورد"          0.19
    " Mention"       0.19
    " mention"       0.19
    " you"           0.17
    Activation Density: 0.360%

    No Known Activations