INDEX
    Explanations

    instances of the word "are" in various contexts

    Negative Logits
    " EVERY"       -0.16
    " stuff"       -0.16
    " anything"    -0.16
    " Anything"    -0.15
    " itself"      -0.15
    "anything"     -0.15
    "anda"         -0.14
    " alles"       -0.14
    "something"    -0.14
    " gist"        -0.14
    Positive Logits
    " times"       0.24
    " few"         0.23
    " fewer"       0.23
    " no"          0.22
    " two"         0.21
    " plenty"      0.20
    " certain"     0.19
    " some"        0.19
    " always"      0.18
    " several"     0.18
    Activation Density: 0.067%

    No Known Activations