Explanations
This neuron identifies the phrase “family of” (and similar classification constructs).
Negative Logits
{(    -0.07
んでいる    -0.06
فنی    -0.06
fu    -0.06
......    -0.06
observer    -0.06
ParticleSystem    -0.06
_Offset    -0.06
>}</    -0.06
371    -0.06
Positive Logits
드    0.07
Pokemon    0.06
리    0.06
الاح    0.06
attendees    0.06
geç    0.06
attn    0.06
дет    0.06
Todos    0.06
®    0.06
Activations Density 0.022%