INDEX
    Explanations

    comparative phrases, especially those using "as."

    Negative Logits
    Token           Logit
    " ſche"         -0.59
    " Conſ"         -0.50
    " ſtate"        -0.49
    " eiffel"       -0.48
    " pleaſure"     -0.47
    " Aéroport"     -0.47
    " Anſ"          -0.45
    " juſ"          -0.45
    " bandou"       -0.45
    " houſe"        -0.44

    Positive Logits
    Token           Logit
    "sowie"         0.90
    " serta"        0.79
    " sowie"        0.74
    " וכן"          0.71
    " nonché"       0.71
    " oraz"         0.70
    " sekä"         0.69
    "以及"          0.68
    "<bos>"         0.68
    " including"    0.66
    Activation Density: 0.014%

    No Known Activations