INDEX
    Explanations

    angle, direction, orientation

    Negative Logits
     disproportion
    -0.10
     plaster
    -0.10
    agas
    -0.09
     Distance
    -0.09
    uct
    -0.09
     Shan
    -0.09
     Wass
    -0.09
     Lantern
    -0.08
    chio
    -0.08
     Luther
    -0.08
    Positive Logits
     polar
    0.29
     Polar
    0.24
     polarization
    0.22
     filter
    0.19
     поля
    0.18
     filters
    0.17
     dep
    0.16
     Filter
    0.15
     Brew
    0.15
    偏
    0.15
    Act Density 0.016%

    No Known Activations