INDEX
Explanations
game-related instructions and information
Negative Logits
lihood        -0.61
Tuls          -0.57
Sven          -0.57
Pruitt        -0.54
Ballard       -0.54
Kush          -0.53
Greenpeace    -0.53
OSH           -0.52
Nun           -0.52
Roose         -0.51
Positive Logits
ilit          0.65
ivia          0.63
english       0.60
unte          0.60
sylv          0.58
SourceFile    0.57
lucent        0.56
rote          0.56
rete          0.53
¶æ            0.52
Activations Density: 3.085%