INDEX
    Explanations

    mentions of the word "sophisticated."

    NEGATIVE LOGITS
    AZY             -0.18
    nest            -0.14
    hot             -0.14
     Amir           -0.14
    érica           -0.14
     Vor            -0.14
    .performance    -0.14
    istance         -0.14
    ÏįÏĢ            -0.14
     Tex            -0.13
    POSITIVE LOGITS
    pios      0.15
    ÏĦεÏģ     0.15
    -hooks    0.15
     ç©       0.14
    unc       0.14
    íĴ        0.14
    elles     0.14
    HC        0.14
    .atom     0.14
    éĤ¦       0.14
    Activation Density: 0.003%

    No Known Activations