Explanations
assertions or statements regarding answers or solutions to various questions
Negative Logits
Token           Logit
¥µ              -0.64
activities      -0.63
ivities         -0.63
idi             -0.62
interstitial    -0.62
Orient          -0.62
rongh           -0.60
angering        -0.60
dealings        -0.60
joints          -0.58
Positive Logits
Token           Logit
YES             1.14
yes             1.13
YES             1.06
yes             0.98
nil             0.87
Nope            0.87
affirmative     0.85
brainer         0.80
Yes             0.78
unequiv         0.76
Activations Density 0.051%