    Negative Logits
    сот
    -0.07
    jl
    -0.07
    oyal
    -0.06
    offer
    -0.06
     rig
    -0.06
    john
    -0.06
    lasses
    -0.06
    ailles
    -0.06
    edly
    -0.06
     Farr
    -0.06
    Positive Logits
    croft
    0.10
    front
    0.08
    es
    0.08
    ouse
    0.08
    side
    0.08
    bum
    0.07
    y
    0.07
     Colony
    0.07
    OUSE
    0.07
    Front
    0.07
    Act Density 0.005%

    No Known Activations