INDEX
    Explanations

    words related to location or spatial relationships

    Negative Logits
    Token           Logit
    "UFF"           -0.16
    "lessly"        -0.15
    "pton"          -0.15
    "ide"           -0.15
    "elon"          -0.14
    "usable"        -0.14
    " estates"      -0.14
    "illas"         -0.14
    "estate"        -0.14
    "spiel"         -0.14
    Positive Logits
    Token           Logit
    "lie"           0.19
    "neath"         0.18
    " ausp"         0.16
    "Invariant"     0.16
    "á»ı"           0.15
    "weg"           0.15
    " supervision"  0.15
    "werp"          0.15
    " ander"        0.15
    "ancode"        0.14
    Activation Density 0.008%

    No Known Activations