INDEX
    Explanations

    phrases related to conflicts of interest in various contexts

    Negative Logits
    Token       Logit
    "adar"      -0.17
    "zimmer"    -0.16
    "vette"     -0.15
    "ivre"      -0.15
    "lettes"    -0.15
    "_WM"       -0.15
    "lsa"       -0.14
    "ELY"       -0.14
    "ktion"     -0.14
    "APPER"     -0.14
    Positive Logits
    Token       Logit
    "703"       0.17
    "killer"    0.15
    "agon"      0.15
    "327"       0.15
    "atas"      0.15
    "MC"        0.14
    " o"        0.14
    "MEM"       0.14
    "399"       0.14
    " astr"     0.14
    Activation Density 0.084%

    No Known Activations