INDEX
Explanations
keywords related to negation, prohibition, or rejection
references to the word "any" in various contexts
Negative Logits
rex          -0.86
gypt         -0.81
staking      -0.76
rez          -0.72
Originally   -0.70
romy         -0.70
expensive    -0.69
raid         -0.68
itals        -0.67
artments     -0.66
Positive Logits
semblance    1.14
THING        1.13
conceivable  1.08
attempt      1.01
resemblance  0.98
kind         0.95
WHERE        0.95
deviation    0.93
body         0.92
meaningful   0.91
Activations Density: 0.079%