INDEX
    Explanations

    phrases emphasizing comparisons or similarities using the word "as."

    Negative Logits
    Token          Logit
    " pleaſure"    -1.01
    " ſta"         -1.00
    " raiſ"        -0.99
    " Houſe"       -0.98
    " Conſ"        -0.98
    " houſe"       -0.93
    " Jefus"       -0.92
    " ſever"       -0.91
    " itſelf"      -0.90
    " ſche"        -0.90
    Positive Logits
    Token          Logit
    " as"          1.55
    " As"          1.32
    " AS"          1.20
    "As"           1.15
    "readAs"       1.03
    " a"           0.92
    "as"           0.91
    " an"          0.82
    " part"        0.78
    " ως"          0.77
    Activation Density: 1.773%

    No Known Activations