INDEX
Explanations
The neuron fires on strongly evaluative adjectives or degree words (e.g. “great,” “worst”) that mark extreme or superlative descriptions.
Negative Logits
Token         Logit
딸            -0.07
priv          -0.07
Assigned      -0.07
Rotation      -0.06
subdiv        -0.06
ients         -0.06
CONSTRAINT    -0.06
alled         -0.06
UpdatedAt     -0.06
Velocity      -0.06
Positive Logits
Token                    Logit
이                       0.06
-res                     0.06
athi                     0.06
apiUrl                   0.06
toHaveBeenCalledTimes    0.06
Стар                     0.06
tá                       0.06
emlrt                    0.06
jails                    0.06
282                      0.06
Activations Density 0.120%
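
For readers unfamiliar with the density figure, the following is a minimal sketch of one common way such a number is computed. It assumes density means the percentage of sampled tokens on which the neuron's activation exceeds a threshold; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron is active. The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 3 active tokens out of 2500 sampled tokens.
sample = [0.0] * 2497 + [0.8, 1.3, 0.2]
density = activation_density(sample)
print(f"{density:.3f}%")
```

A low density like the one reported here would mean the neuron is silent on the vast majority of tokens, which fits an explanation tied to a narrow class of words.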