INDEX
Explanations
phrases indicating importance or significance in context
Negative Logits
DockStyle          -1.11
ModelExpression    -0.93
виправивши         -0.88
мәкал              -0.85
thingy             -0.79
שוליים             -0.78
جوايز              -0.75
חיצוניים           -0.73
tagHelperRunner    -0.72
principalColumn    -0.72
Positive Logits
fact               0.71
lack               0.61
,                  0.60
“                  0.59
same               0.59
combination        0.57
is                 0.56
result             0.54
success            0.53
combined           0.52
Activations Density 0.294%