    Explanations

    phrases indicating physical positioning or hierarchy

    references to the concept of "beneath" or "underneath."

    Negative Logits
    Token            Logit
    "yah"            -0.87
    "ordan"          -0.78
    "eln"            -0.78
    "atic"           -0.70
    "ern"            -0.67
    "yrim"           -0.67
    " Preferred"     -0.67
    "andowski"       -0.67
    "ebus"           -0.65
    "agne"           -0.65
    Positive Logits
    Token            Logit
    "neath"          1.09
    "eatures"        1.07
    "ĸļ"             0.98
    "pins"           0.96
    " beneath"       0.90
    "ĨĴ"             0.86
    " tremend"       0.83
    "¥ŀ"             0.81
    " layers"        0.81
    " underneath"    0.80
    Activation Density: 0.014%

    No Known Activations
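
The activation density figure above can be read as the share of tokens on which the feature fires. The sketch below shows one plausible way to compute such a number. It assumes density means the fraction of tokens whose activation exceeds a threshold of zero; the function name `activation_density` and the sample values are hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active at all. The page itself does not state the exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, of which only 2 activate the feature.
sample = [0.0] * 9_998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density as low as 0.014% would mean the feature fires on roughly 1 token in 7,000 under this definition.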