INDEX
Explanations
references to columns in data tables or structured data formats
Negative Logits
ycl      -0.16
rim      -0.16
kola     -0.15
anvas    -0.15
lemen    -0.15
eldon    -0.15
slash    -0.15
sembl    -0.14
emin     -0.14
upt      -0.14
Positive Logits
ar       0.29
arity    0.26
ists     0.23
aire     0.22
aires    0.21
wise     0.19
ophon    0.19
ISTS     0.18
heads    0.17
-wise    0.17
Activations Density: 0.023%