INDEX
Explanations
phrases indicating causality or outcomes
Negative Logits
pherd        -0.72
teach        -0.69
ToUse        -0.69
abusing      -0.68
DotNetBar    -0.67
Dak          -0.66
expandable   -0.66
hates        -0.63
ppery        -0.63
cometer      -0.63
Positive Logits
result       1.80
resulted     1.67
results      1.59
resulting    1.57
Result       1.56
RESULT       1.52
result       1.49
Result       1.38
résult       1.37
RESULT       1.37
Activations Density 0.136%