INDEX
Explanations
numerical values preceded by the phrase "up to"
phrases related to limits or maximums
Negative Logits
Token       Logit
Dro         -0.68
Reviewer    -0.63
åº          -0.62
News        -0.61
Posted      -0.61
doesnt      -0.60
Tools       -0.60
Moving      -0.60
Leave       -0.59
Still       -0.59
Positive Logits
Token       Logit
150         1.05
200         1.01
300         0.97
100         0.97
500         0.95
400         0.95
80          0.94
3000        0.94
1500        0.94
50          0.94
Activations Density 0.043%
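
The density figure can be read as the share of sampled tokens on which the feature fires. The dashboard does not state its exact definition, so the sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation; the dashboard's own definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 43 of 100,000 tokens.
sample = [1.0] * 43 + [0.0] * (100_000 - 43)
print(f"{activation_density(sample):.3%}")  # prints 0.043%
```

Under this reading, a density of 0.043% would mean the feature is active on roughly 4 tokens in every 10,000.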