INDEX
Explanations
instances of the word "whether" and the clauses that follow it
questions regarding societal norms and conditions
Negative Logits
nowhere      -0.82
ãĤĴ          -0.71
ciating      -0.71
none         -0.70
IGH          -0.69
nothing      -0.62
unknown      -0.62
ulner        -0.60
licts        -0.59
Thrones      -0.59
Positive Logits
ever         1.38
anymore      1.10
EVER         1.08
qualifies    0.99
adequately   0.99
ever         0.96
necessarily  0.92
any          0.90
actually     0.87
properly     0.86
Activations Density: 0.281%