INDEX
Explanations
statements beginning with "Yes" and followed by affirming or agreeing words or phrases
affirmations or positive acknowledgments
Negative Logits
-+-+          -0.69
perial        -0.65
inese         -0.65
rall          -0.62
isf           -0.62
Gleaming      -0.61
comprom       -0.61
kefeller      -0.60
actionDate    -0.57
tnc           -0.57
Positive Logits
terday        1.63
hua           1.00
sir           0.94
indeed        0.89
,             0.86
TER           0.83
yes           0.80
!,            0.78
ter           0.77
hur           0.76
Activations Density: 0.023%