Explanations
The neuron activates on mentions of “witness” (in forms like “Witness,” “Witnesses,” etc.).
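As a rough illustration of the explanation above, the following sketch checks a text for surface forms of "witness" with a regular expression. It is only a text-level proxy built on the token forms listed on this page, not the neuron itself: the pattern and the function names `mentions_witness` and `witness_forms` are hypothetical, and a real activation depends on the model's tokenization and context.

```python
import re

# Hypothetical text-level proxy for the explanation: match "witness",
# "eyewitness", and their inflected forms, ignoring case.
# This is NOT the neuron; it only approximates the described pattern.
WITNESS_PATTERN = re.compile(r"\b(?:eye)?witness(?:es|ed|ing)?\b", re.IGNORECASE)


def mentions_witness(text: str) -> bool:
    """Return True if the text contains any form of "witness"."""
    return WITNESS_PATTERN.search(text) is not None


def witness_forms(text: str) -> list[str]:
    """Return every matched form of "witness" in order of appearance."""
    return WITNESS_PATTERN.findall(text)
```

A proxy like this could serve as a quick sanity check when scanning examples by hand, for instance to flag texts where the explanation predicts an activation.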
Negative Logits
Token       Logit
Small       -0.07
Valencia    -0.07
Default     -0.07
Small       -0.07
На          -0.06
<Model      -0.06
42          -0.06
Essex       -0.06
óa          -0.06
升          -0.06
Positive Logits
Token       Logit
witness     0.15
witnesses   0.14
Witness     0.13
witnessed   0.12
Witness     0.11
eyewitness  0.10
Witnesses   0.09
witnessing  0.09
itness      0.08
IPAddress   0.07
Activations Density: 0.005%