Explanations
The neuron detects the phrase “viewed in the light most favorable to [the X],” which is used to state the standard of review for evidence.
Negative Logits
报告    -0.07
_runner    -0.07
conte    -0.07
Homepage    -0.06
دور    -0.06
439    -0.06
passwd    -0.06
Cri    -0.06
Looking    -0.06
uper    -0.06
Positive Logits
заболеваний    0.07
dro    0.07
πολι    0.07
/dev    0.07
greg    0.06
<Button    0.06
marine    0.06
κυ    0.06
Apple    0.06
(ValueError    0.06
Activations Density 0.003%