    Negative Logits
    Token              Logit
    " composition"     -2.11
    " Composition"     -2.11
    "composition"      -2.08
    "Composition"      -2.06
    " composing"       -2.05
    " composed"        -1.99
    " compose"         -1.97
    " COMPOSITION"     -1.94
    " compositions"    -1.91
    "composed"         -1.84
    Positive Logits
    Token    Logit
    "er"     0.67
    " of"    0.65
    "e"      0.59
    "ed"     0.58
    "s"      0.57
    "es"     0.55
    "o"      0.52
    "ie"     0.51
    "of"     0.51
    "i"      0.49
    Activation Density: 0.085%

    No Known Activations