INDEX
Explanations
punctuation marks and symbols that indicate emphasis or separation in text
Negative Logits
iple               -0.69
ãĤ©                -0.64
©¶æ                -0.64
NX                 -0.63
(no token shown)   -0.60
iler               -0.59
orts               -0.58
usha               -0.58
Moines             -0.58
nood               -0.58
Positive Logits
particularly       1.34
especially         1.34
something          1.25
including          1.25
namely             1.19
even               1.08
thus               1.03
often              1.00
provided           1.00
which              0.97
Activations Density 0.106%
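
As a rough illustration of how a density figure like the one above might be obtained, the sketch below assumes that "activation density" means the fraction of token positions at which the feature fires (activation above zero). That definition, the function name activation_density, and the sample data are all assumptions for illustration; the page itself does not state how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 11 of which are active.
sample = [0.0] * 10_000
for i in range(11):
    sample[i * 900] = 1.5  # arbitrary non-zero activation value

density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is silent on the vast majority of tokens, which is typical of sparse features.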