INDEX
Explanations
instances of comparisons or conditions related to numbers and quantities
Negative Logits
Second      -0.20
SECOND      -0.17
втор        -0.17
الثاني      -0.17
第二        -0.17
第二        -0.16
Second      -0.16
Secondly    -0.16
SECOND      -0.16
second      -0.16
Positive Logits
3           0.43
third       0.43
thirds      0.37
Third       0.36
third       0.35
трет        0.34
-third      0.33
Third       0.32
THIRD       0.32
３          0.32
Activations Density 0.128%
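The density figure can be read as the share of token positions on which the feature fires. A minimal sketch of that reading, assuming density means the fraction of sampled positions with an activation above zero; the `activation_density` helper and the toy activations are hypothetical and are not taken from this dashboard:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled token positions
    where the feature is active. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 1000 token positions, the feature fires on 2.
toy_activations = [0.0] * 998 + [1.7, 0.4]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.128% would correspond to roughly 1.28 active positions per 1000 sampled tokens.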