Explanations
instances of the word "nothing" and its variations, suggesting a focus on themes of lack or absence
Negative Logits
Token              Logit
Thornton           -0.15
mont               -0.15
something          -0.14
adiens             -0.14
ivel               -0.14
antar              -0.14
inel               -0.14
ExecutionContext   -0.14
ME                 -0.13
emmel              -0.13
Positive Logits
Token    Logit
else     0.30
ness     0.28
else     0.23
/no      0.21
Else     0.21
wrong    0.19
ELSE     0.19
_else    0.19
NESS     0.19
else     0.19
Activations Density 0.033%