INDEX
Explanations
references to video game titles and series
Negative Logits
aval      -0.15
ãĥ³ãĥķ    -0.14
.sy       -0.14
thora     -0.14
ãģĮãģĦ    -0.14
.har      -0.14
malé      -0.14
ship      -0.14
.neg      -0.14
CRT       -0.13
Positive Logits
Metal     0.27
Solid     0.26
Metal     0.24
Snake     0.23
Solid     0.22
metal     0.21
solid     0.19
Snake     0.19
-solid    0.19
solid     0.19
Activations Density 0.003%