INDEX
    Explanations

    occurrences of the word "love" in various contexts

    Negative Logits
    Token               Logit
    "lage"              -0.15
    "oot"               -0.15
    "unner"             -0.15
    "l"                 -0.15
    ".scalablytyped"    -0.14
    "ootball"           -0.14
    "λαν"               -0.14
    "ural"              -0.14
    "osy"               -0.13
    "um"                -0.13

    Positive Logits
    Token               Logit
    " affair"           0.15
    "fully"             0.15
    "-kind"             0.14
    "amentals"          0.14
    "enci"              0.14
    "full"              0.14
    " arms"             0.14
    "joy"               0.13
    "ably"              0.13
    "be"                0.13
    Activation Density: 0.046%

    No Known Activations