INDEX
Explanations
phrases related to knowledge, understanding, and awareness
statements of existence or being
Negative Logits
itches      -0.70
Footnote    -0.69
Hitch       -0.66
verts       -0.66
DOI         -0.63
Moines      -0.62
Dickinson   -0.61
rites       -0.61
erald       -0.60
refrain     -0.60
Positive Logits
happening   1.18
meant       0.92
supposed    0.90
going       0.89
gonna       0.88
nt          0.87
occurring   0.86
wrong       0.86
done        0.84
presently   0.82
Activations Density: 0.090%