INDEX
Explanations
verbs related to providing explanations or reasons
instances of the word "explain" in various contexts
Negative Logits
illet         -0.81
engeance      -0.80
estial        -0.73
thritis       -0.71
nown          -0.71
inal          -0.71
ortunately    -0.69
mire          -0.67
atri          -0.67
pione         -0.66
Positive Logits
why            1.13
WHY            1.12
explain        1.06
why            1.01
explanations   1.01
explains       1.01
Explain        0.99
explan         0.96
explaining     0.93
cases          0.92
Activations Density 0.019%