Explanations
thresholds or limits
references to various thresholds in different contexts
Negative Logits
ateur    -0.78
obbies   -0.71
oir      -0.69
adoes    -0.68
ulf      -0.66
ocate    -0.66
arus     -0.65
irit     -0.62
enza     -0.62
rongh    -0.61
Positive Logits
threshold     0.95
thresholds    0.93
clearance     0.88
cutoff        0.86
posts         0.85
criteria      0.77
=$            0.77
hurdle        0.77
requirement   0.77
stones        0.75
Activations Density: 0.058%