Explanations
phrases indicating intention or purpose
mentions of the word "mind" and the concept of consideration
Negative Logits
heres       -0.68
iott        -0.66
nect        -0.64
recess      -0.63
hell        -0.60
Reviewed    -0.59
oqu         -0.59
phia        -0.58
mouth       -0.58
advert      -0.58
Positive Logits
when        0.85
WHEN        0.71
oldown      0.64
rals        0.63
dstg        0.63
enance      0.63
whenever    0.61
aepernick   0.60
budgetary   0.58
perature    0.58
Activations Density 0.090%