Explanations
The neuron activates on occurrences of the word “data.”
Negative Logits
Token          Logit
Rus            -0.08
tratamiento    -0.07
электри        -0.07
pring          -0.07
μπ             -0.07
.Paths         -0.06
будто          -0.06
.Class         -0.06
\Action        -0.06
элек           -0.06
Positive Logits
Token          Logit
data           0.10
Data           0.07
DATA           0.07
historical     0.07
Dean           0.06
cán            0.06
deux           0.06
statistics     0.06
NX             0.06
dung           0.06
Activations Density: 0.034%