Explanations
instances of decreasing quantities or values
Negative Logits
”       -0.85
“       -0.82
‘       -0.79
“       -0.79
.”      -0.76
,”      -0.75
dür     -0.72
’       -0.72
!”      -0.70
(“      -0.70
Positive Logits
decrease    1.37
Decrease    1.33
decreases   1.27
Decrease    1.27
decre       1.25
decrease    1.25
Decre       1.24
Decreased   1.21
Decre       1.20
decreased   1.18
Activations Density 0.182%