INDEX
Explanations
interaction
This neuron activates on occurrences of the word “interaction” (and its variants) in text.
Negative Logits
calcul       -0.08
schedule     -0.08
launch       -0.07
pid          -0.07
scheduled    -0.07
calculates   -0.07
latitude     -0.07
planning     -0.07
planned      -0.07
(column      -0.07
Positive Logits
interact      0.12
Interaction   0.09
interacts     0.09
interaction   0.08
interacting   0.08
_mob          0.08
interaction   0.08
орот          0.07
交流          0.07
тим           0.07
Activations Density 0.023%
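
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of tokens on which the activation is above zero; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 23 active tokens out of 100,000.
sample = [1.5] * 23 + [0.0] * (100_000 - 23)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.023% corresponds to roughly 23 active tokens per 100,000.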