INDEX
Explanations
instances of positive or reinforcing expressions
Negative Logits
initComponents  -0.77
ker             -0.72
kira            -0.67
</i>            -0.67
Hawks           -0.66
kas             -0.65
Sark            -0.65
Lindsay         -0.65
cas             -0.64
.               -0.63
Positive Logits
&+    1.54
>+</  1.51
}+    1.38
+     1.37
()+   1.35
)+    1.32
_+    1.30
$+    1.29
%+    1.28
+     1.27
Activations Density 0.453%