INDEX
Explanations
television program titles and associated details
Negative Logits
endor         -0.20
asel          -0.19
bose          -0.17
маз           -0.16
trusted       -0.15
allel         -0.15
orest         -0.15
ANTE          -0.14
/interface    -0.14
rosso         -0.14
Positive Logits
Eff           0.17
iku           0.16
Davies        0.15
Lever         0.15
530           0.15
Attempts      0.14
il            0.14
904           0.14
Pierce        0.14
516           0.14
Activations Density 0.034%