Explanations
color codes in hexadecimal format
Negative Logits
fffffff   -0.17
ãĥ¼ãĤº    -0.16
zen       -0.15
hub       -0.15
z         -0.15
humane    -0.14
eref      -0.14
Hub       -0.14
fun       -0.13
ipop      -0.13
Positive Logits
uw               0.17
00               0.16
idd              0.15
660              0.15
batis            0.14
æŁ               0.14
66               0.14
ParameterValue   0.14
.datab           0.14
chatt            0.14
Activations Density: 0.014%