INDEX
Explanations
references to linear concepts, particularly in mathematical contexts
Negative Logits
nonlinear    -0.20
amax         -0.19
ENTE         -0.18
turnstile    -0.17
ennis        -0.16
ente         -0.16
engers       -0.16
anela        -0.16
yre          -0.15
LENG         -0.15
Positive Logits
ly           0.37
ized         0.31
ization      0.27
izing        0.26
izable       0.25
ities        0.24
ize          0.23
ised         0.23
/angular     0.21
izes         0.20
Activations Density 0.012%