INDEX
Explanations
the presence of the word "help"
the phrase "can't help" in various contexts
Negative Logits
theless       -0.66
etheus        -0.64
andom         -0.59
avior         -0.58
sonian        -0.58
naire         -0.56
punk          -0.54
esome         -0.54
initely       -0.54
ortality      -0.54
Positive Logits
ctor          0.66
noticing      0.64
ギ            0.61
":["          0.60
Sega          0.59
des           0.58
エル          0.58
fielding      0.56
#$#$          0.55
sponsoring    0.55
Activations Density 0.033%