INDEX
    Explanations

    Mentions of the word "simple" and its variations, used in relation to concepts and processes

    Negative Logits (tokens quoted to show leading spaces)
    Token             Logit
    " simply"         -0.22
    " simplement"     -0.21
    " Simply"         -0.20
    "simple"          -0.19
    "Simply"          -0.19
    "Simple"          -0.18
    "_simple"         -0.18
    " einfach"        -0.18
    " simpl"          -0.18
    " simplified"     -0.17
    Positive Logits (tokens quoted to show leading spaces)
    Token             Logit
    "ton"              0.43
    "tons"             0.43
    "xes"              0.35
    "-minded"          0.33
    "TON"              0.30
    " minded"          0.29
    "ctic"             0.27
    " yet"             0.26
    "/basic"           0.24
    "/plain"           0.23
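
    The two logit lists above can be loaded into a small script for ranking and inspection. This is a minimal sketch, not part of the original page: the names NEGATIVE_LOGITS, POSITIVE_LOGITS, strongest, and count_leading_space are hypothetical, and it assumes that the extra leading space on some entries is part of the token itself (a word-initial token) rather than layout.

```python
# Token/logit pairs copied from the lists above.
# Assumption: a leading space is part of the token, so " Simply" != "Simply".
NEGATIVE_LOGITS = [
    (" simply", -0.22), (" simplement", -0.21), (" Simply", -0.20),
    ("simple", -0.19), ("Simply", -0.19), ("Simple", -0.18),
    ("_simple", -0.18), (" einfach", -0.18), (" simpl", -0.18),
    (" simplified", -0.17),
]
POSITIVE_LOGITS = [
    ("ton", 0.43), ("tons", 0.43), ("xes", 0.35), ("-minded", 0.33),
    ("TON", 0.30), (" minded", 0.29), ("ctic", 0.27), (" yet", 0.26),
    ("/basic", 0.24), ("/plain", 0.23),
]


def strongest(pairs, k=3):
    """Return the k pairs with the largest absolute logit (ties keep input order)."""
    return sorted(pairs, key=lambda pair: -abs(pair[1]))[:k]


def count_leading_space(pairs):
    """Count tokens that start with a space, i.e. likely word-initial tokens."""
    return sum(1 for token, _ in pairs if token.startswith(" "))


if __name__ == "__main__":
    # repr() keeps the leading spaces visible in the printed output.
    for token, value in strongest(NEGATIVE_LOGITS + POSITIVE_LOGITS, k=5):
        print(f"{token!r:>16} {value:+.2f}")
```

    Printing tokens with repr() keeps " Simply" and "Simply" distinguishable, which a bare listing does not.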
    Activation Density: 0.040%
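
    To put the density figure in concrete terms, the sketch below converts a percentage into an expected count of activating tokens. It assumes that activation density means the fraction of sampled tokens on which the feature fires; the page itself does not define the term, and the function name expected_activations is hypothetical.

```python
def expected_activations(density_percent, n_tokens):
    """Expected number of tokens on which the feature fires.

    Assumption: density is the fraction of sampled tokens with a nonzero
    activation, expressed as a percentage.
    """
    return round(density_percent / 100 * n_tokens)


if __name__ == "__main__":
    # 0.040% of one million tokens
    print(expected_activations(0.040, 1_000_000))
```

    Under that assumption, a density of 0.040% corresponds to roughly 4 activating tokens per 10,000.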

    No Known Activations