Explanations
Introduces explanation or consequence
Negative Logits
cluding       0.40
YPE           0.40
Including     0.40
ensee         0.37
including     0.36
direct        0.36
chon          0.35
arrang        0.35
incluindo     0.34
fluct         0.33
Positive Logits
entails       1.06
involves      0.98
necessitates  0.91
entail        0.89
requires      0.86
требует       0.83
innebär       0.83
necessitate   0.82
requires      0.81
entailed      0.81
Activations Density: 0.004%