INDEX
Explanations
The neuron activates on mentions of the Indian Premier League (IPL) and related league identifiers.
Negative Logits
Token        Logit
padded       -0.07
="'          -0.07
Seznam       -0.06
keinen       -0.06
찬           -0.06
障           -0.06
sincerity    -0.06
절           -0.06
predicate    -0.06
deposit      -0.06
Positive Logits
Token                 Logit
.general              0.06
*S                    0.06
(MethodImplOptions    0.06
ोफ                    0.06
(↵↵                   0.06
ersh                  0.06
'",↵                  0.06
viol                  0.06
ฑ                     0.06
']]↵                  0.06
Activations Density: 0.004%