Explanations
architectural descriptors and measurements
Negative Logits
test    -0.41
ser     -0.40
        -0.40
ks      -0.40
(       -0.40
مص      -0.40
Y       -0.39
med     -0.39
[]      -0.39
ה       -0.38
Positive Logits
AnchorStyles      1.20
itſelf            1.06
myſelf            0.99
'\\;'             0.99
CloseOperation    0.95
doubtnut          0.93
Efq               0.90
himſelf           0.89
iſt               0.89
ſeveral           0.89
Activations Density 0.970%