Explanations
numerical data or specific measurements
Negative Logits
ContentAlignment    -0.19
tics                -0.16
tic                 -0.15
abis                -0.15
esses               -0.15
ETCH                -0.14
otre                -0.14
atics               -0.14
extra               -0.14
_extra              -0.14
Positive Logits
d         0.15
Chall     0.14
ónico     0.13
ONTAL     0.13
.openg    0.13
Ones      0.13
panor     0.13
شار       0.13
him       0.13
än        0.13
Activations Density: 0.078%