Explanations
phrases indicating surprising or unexpected outcomes
Negative Logits
Token             Logit
SEDS              -0.50
beginnetje        -0.50
ⓧ                 -0.50
XmlAccessType     -0.49
WithIOException   -0.48
Wikimedijinoj     -0.47
msgTypes          -0.46
RegressionTest    -0.46
deca              -0.46
Destroyer         -0.46
Positive Logits
Token             Logit
Apparently        0.71
Apparently        0.71
apparently        0.66
Ternyata          0.65
apparently        0.58
ternyata          0.56
оказалось         0.50
okaza             0.49
blijkt            0.48
どうやら          0.47
Activations Density: 0.285%
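
The density figure can be read as the share of sampled token positions on which the feature fires. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the zero threshold, and the sample counts (57 active positions out of 20,000) are hypothetical and chosen only to reproduce a value of 0.285%. They are not taken from the dashboard's underlying data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions
    with a strictly positive activation. The tool may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 20,000 token positions, 57 of them active.
sample = [0.0] * 19943 + [1.0] * 57
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is sparse: it is silent on the vast majority of tokens and fires only in a narrow set of contexts.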