INDEX
Explanations
questions related to reasoning or cause
questions and inquiries
Negative Logits
blot      -0.77
apixel    -0.74
imet      -0.69
hold      -0.68
ura       -0.65
booth     -0.65
podium    -0.63
borg      -0.63
satell    -0.63
olor      -0.62
Positive Logits
Because   1.33
Because   1.30
Reason    1.21
WHY       1.17
Cause     1.15
Reasons   1.15
Why       1.11
reasons   1.04
Firstly   1.02
Simple    1.01
Activations Density 0.241%