INDEX
Explanations
questions marked by a question mark
Negative Logits
destro      -0.82
©¶æ         -0.80
unbeliev    -0.71
confir      -0.70
mutually    -0.69
spons       -0.69
rein        -0.69
encount     -0.68
aku         -0.68
ingest      -0.68
Positive Logits
Surely      1.15
Where       1.13
Well        1.10
Answer      1.05
Certainly   1.04
Why         1.04
Probably    1.03
Does        1.02
What        1.02
Possibly    1.00
Activations Density: 0.078%