Explanations
phrases indicating outcomes or consequences
Negative Logits
Token         Logit
borg          -0.16
angen         -0.15
awan          -0.15
odied         -0.14
ovich         -0.14
isini         -0.14
uden          -0.14
keterangan    -0.14
ods           -0.14
adel          -0.14
Positive Logits
Token         Logit
result        0.87
result        0.69
Result        0.69
-result       0.63
.result       0.63
Result        0.63
consequence   0.63
resultado     0.62
(result       0.61
RESULT        0.61
Activations Density: 0.316%