INDEX
Explanations
phrases that express justification or reasoning behind actions and decisions
Negative Logits
scrapy        -0.72
jadx          -0.67
PYX           -0.66
رشف           -0.64
trataba       -0.63
erialization  -0.63
كومونز        -0.62
ferons        -0.62
fjspx         -0.61
وتسجيلات      -0.61
Positive Logits
reason        1.69
reasons       1.61
reason        1.57
Reasons       1.53
Reason        1.53
Reason        1.51
Reasons       1.49
reasons       1.49
why           1.38
REASON        1.34
Activations Density: 0.256%