INDEX
    Explanations

    occurrences of the word "in"

    NEGATIVE LOGITS
    "owie"     -0.18
    "flate"    -0.16
    "lessly"   -0.16
    "ify"      -0.15
    "avers"    -0.15
    "ctors"    -0.15
    " Dana"    -0.14
    "sted"     -0.14
    "asted"    -0.13
    "ξι"       -0.13
    POSITIVE LOGITS
    " depth"   0.26
    " Depth"   0.21
    "_depth"   0.21
    " house"   0.20
    "-depth"   0.20
    "depth"    0.20
    "Depth"    0.19
    " situ"    0.19
    "house"    0.19
    " jokes"   0.17
    Activation Density: 0.027%

    No Known Activations