INDEX
Explanations
This neuron activates on the token “air” in either capitalization (“Air” or “air”), so it effectively detects mentions of air.
Negative Logits
Token               Logit
['#                 -0.07
структу             -0.07
novice              -0.07
Duc                 -0.07
Prosecutor          -0.06
NSStringFromClass   -0.06
renovation          -0.06
ugu                 -0.06
Rud                 -0.06
_pickle             -0.06
Positive Logits
Token               Logit
air                 0.20
Air                 0.19
Air                 0.16
AIR                 0.14
AIR                 0.14
air                 0.13
-air                0.12
_air                0.11
                    0.10
airflow             0.10
Activations Density 0.025%
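
The density figure above can be read as the fraction of sampled tokens on which the neuron fires. The page does not state its exact definition, so the sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: one active token out of 4000.
sample = [0.0] * 3999 + [1.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.025% would correspond to roughly one active token in every 4000, which fits a neuron that responds to a single word.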