INDEX
Explanations
references to hierarchical levels or classifications
Negative Logits
efe         -0.18
oggled      -0.16
sse         -0.15
sj          -0.15
srv         -0.14
ismic       -0.14
sen         -0.14
eph         -0.14
enticate    -0.14
sko         -0.14
Positive Logits
most        0.43
ech         0.29
reaches     0.27
cased       0.27
archy       0.27
MOST        0.27
-middle     0.26
class       0.25
-tier       0.25
case        0.23
Activations Density: 0.034%