Explanations
educational games
The neuron detects instructional or explanatory language: terms related to teaching, learning, explaining, or providing safety or educational guidance.
Negative Logits
ö              -0.07
datastore      -0.06
arParams       -0.06
럴             -0.06
izarre         -0.06
_:*            -0.06
gregator       -0.06
trium          -0.06
Igor           -0.06
revelations    -0.06
Positive Logits
時に           0.07
☴              0.07
(ne            0.06
Sample         0.06
Vari           0.06
not            0.06
],↵            0.06
unpaid         0.06
concerned      0.06
amber          0.06
Activations Density 0.089%