INDEX
Explanations
instances of the word "mostly"
phrases that indicate the frequency or predominance of an element or concept
Negative Logits
anth        -0.77
hawk        -0.69
Express     -0.68
uers        -0.67
Reviewer    -0.67
atron       -0.66
ciation     -0.66
orf         -0.65
ilion       -0.64
ENA         -0.64
Positive Logits
consisted    0.93
ceremonial   0.86
consisting   0.84
unnoticed    0.83
unchanged    0.82
consist      0.82
consists     0.81
unexpl       0.81
overlooked   0.79
focused      0.78
Activations Density 0.053%