Explanations
references to specific characters and transformative actions in gaming and narratives
Negative Logits
rus        -0.18
etary      -0.18
eros       -0.17
dle        -0.16
lectric    -0.16
zers       -0.15
gear       -0.15
hard       -0.15
amma       -0.15
lect       -0.15
Positive Logits
itect      0.26
edly       0.22
itecture   0.21
red        0.19
adia       0.18
pNet       0.17
lings      0.16
aic        0.16
'ın        0.16
¶Į         0.16
Activations Density: 0.009%