INDEX
NEGATIVE LOGITS
WND            -0.08
Thi            -0.08
еза            -0.08
entertained    -0.07
Debt           -0.07
hurried        -0.07
extraction     -0.07
hitro          -0.07
Competitive    -0.07
incess         -0.07
POSITIVE LOGITS
known          0.11
_known         0.10
known          0.10
calibration    0.09
conocido       0.09
Known          0.09
benchmark      0.09
conocidas      0.09
Known          0.09
ज्ञ             0.09
Activations Density 0.011%