Explanations
special characters or symbols that might indicate formatting or coding elements
Negative Logits
ünk         -0.20
ük          -0.19
eko         -0.18
etz         -0.17
etc         -0.17
defgroup    -0.17
etus        -0.17
ets         -0.17
elian       -0.16
ofs         -0.16
Positive Logits
ket         0.30
bb          0.29
ink         0.27
kk          0.27
kn          0.26
kb          0.26
it          0.25
ine         0.25
kre         0.25
ivel        0.24
Activations Density 0.004%