INDEX
    Explanations
    Negative Logits
    " 구매"  -0.09
    " buyer"  -0.09
    " underline"  -0.08
    " compras"  -0.08
    " underscore"  -0.08
    " blem"  -0.08
    " strike"  -0.08
    " olive"  -0.08
    " purchases"  -0.07
    " BUY"  -0.07
    Positive Logits
    " GPT"  0.12
    "GPT"  0.12
    "伦理"  0.10
    " dangerously"  0.10
    " perigos"  0.10
    " epistem"  0.10
    " dangers"  0.10
    " cognitive"  0.10
    " dangerous"  0.10
    "危险"  0.10
    Activation Density: 0.020%

    No Known Activations