INDEX
Explanations
phrases related to actions or accomplishments
instances of statements that assert or affirm a conclusion or condition
Negative Logits
ascript                    -0.74
ieval                      -0.70
igs                        -0.69
arted                      -0.68
dden                       -0.67
BuyableInstoreAndOnline    -0.66
FG                         -0.63
si                         -0.63
imental                    -0.60
xia                        -0.59
Positive Logits
cru                         0.80
downright                   0.75
damned                      0.73
horm                        0.73
damn                        0.72
therein                     0.72
predictably                 0.71
darn                        0.69
certainly                   0.69
Solitaire                   0.68
Activations Density 0.503%