INDEX
Explanations
Researchers tested, considering switching
The neuron detects personal names and name-like placeholders or proper nouns (e.g., bracketed name fields).
Negative Logits
percentage  0.42
или  0.37
ab  0.37
reszt  0.37
percentages  0.37
거고  0.37
或  0.36
либо  0.36
belum  0.35
あるいは  0.35
Positive Logits
सवारी  0.35
tối  0.34
ஓர்  0.33
Zoological  0.32
有个  0.32
sejumlah  0.32
वारदात  0.32
Беларусі  0.32
发育  0.31
sonar  0.31
Activations Density 0.366%