INDEX
    Explanations

    words related to the concept of "home" or "place."

    Negative Logits
    Token    Logit
    yb       -0.19
    ivan     -0.18
    edBy     -0.17
    elist    -0.17
    outh     -0.16
    τολ      -0.15
    ління    -0.15
    y        -0.15
    yw       -0.15
    assen    -0.15
    Positive Logits
    Token    Logit
    yssey    0.33
    ding     0.31
    ded      0.29
    dy       0.26
    ders     0.26
    der      0.22
    den      0.21
    ermal    0.21
    ges      0.20
    nocení   0.20
    Activation Density 0.038%

    No Known Activations