    Explanations

    phrases containing the words "on the surface."

    phrases indicating surface-level observations or appearances

    Negative Logits
    Token         Logit
    "ļéĨĴ"        -0.81
    "arov"        -0.78
    "ãĥ¥"         -0.78
    "eed"         -0.77
    "leground"    -0.73
    "TN"          -0.70
    "inar"        -0.69
    "ufact"       -0.69
    "staking"     -0.68
    "arah"        -0.68
    Positive Logits
    Token               Logit
    " however"          1.08
    " though"           0.82
    " somew"            0.76
    " there"            0.69
    " psychologists"    0.68
    " moreover"         0.66
    " please"           0.66
    " suffice"          0.65
    " whoever"          0.63
    "there"             0.63
    Activation Density: 0.274%

    No Known Activations