Explanations
assumptions and idealized models
Negative Logits
اجر    0.48
सेंचुरी    0.47
tầm    0.47
运营    0.45
Supervision    0.43
Certification    0.43
ފައި    0.43
Stake    0.43
তাহাকে    0.42
╹    0.42
Positive Logits
idealized    1.08
ideal    1.03
theory    1.00
classical    1.00
理想    0.95
ideal    0.95
assuming    0.93
assumes    0.93
teoria    0.89
이론    0.88
Activations Density: 0.035%