INDEX
    Explanations

    mentions of romance novels and related literature themes

    Negative Logits
    Token           Logit
    "868"           -0.19
    "rame"          -0.15
    "rious"         -0.15
    " Gram"         -0.15
    " payday"       -0.14
    "unken"         -0.14
    "_KERNEL"       -0.14
    "pter"          -0.14
    ".TestTools"    -0.14
    "865"           -0.14
    Positive Logits
    Token           Logit
    " Faul"         0.15
    "OutOfBounds"   0.15
    "adt"           0.15
    "磨"            0.14
    "phinx"         0.14
    "akh"           0.14
    "acas"          0.14
    "audi"          0.14
    "arden"         0.13
    "-series"       0.13
    Activation Density: 0.159%

    No Known Activations