Explanations
references to video game characters or titles
references to video game titles and characters
Negative Logits
rique      -0.92
coal       -0.72
ique       -0.72
uate       -0.69
insula     -0.67
utility    -0.66
rons       -0.65
most       -0.63
aution     -0.62
tern       -0.62
Positive Logits
lda        0.98
Zelda      0.94
Metroid    0.84
Wii        0.83
™:         0.76
weed       0.76
HAM        0.76
Bros       0.73
Pg         0.73
Craft      0.72
Activations Density 0.015%