Explanations
numbers, particularly those indicating a specific quantity or range
numerical data, particularly ranges and quantities
Negative Logits
conduc            -0.65
unnecess          -0.60
correctly         -0.57
alike             -0.55
whatsoever        -0.54
notor             -0.53
unfocusedRange    -0.53
therap            -0.53
Reviewer          -0.53
behavi            -0.51
Positive Logits
]-                0.79
%-                0.79
-$                0.72
and               0.71
gee               0.71
ties              0.71
endor             0.64
brow              0.62
]"                0.62
and               0.62
Activations Density: 0.084%