INDEX
    Explanations

    the word "explain" and possibly words related to scientific explanations or theories.

    explanations or theories

    phrases related to providing explanations

    Negative Logits
    Token           Logit
    "engeance"      -0.83
    "nown"          -0.79
    "illet"         -0.77
    "inal"          -0.76
    "jet"           -0.76
    "thritis"       -0.74
    "nir"           -0.72
    "naissance"     -0.72
    "ascus"         -0.71
    "atri"          -0.70
    Positive Logits
    Token             Logit
    " why"            1.15
    " WHY"            1.10
    " explanations"   0.98
    "ĸļ"              0.93
    " explan"         0.93
    " explain"        0.93
    " Explain"        0.92
    "why"             0.92
    " explains"       0.87
    "urated"          0.86
    Activation Density: 0.020%

    No Known Activations