INDEX
Explanations
transformation
This neuron activates on Polish words (i.e. tokens from Polish text).
Negative Logits
estudio     -0.06
iba         -0.06
twig        -0.06
Cardinal    -0.06
icode       -0.06
vette       -0.06
otypical    -0.06
fleet       -0.06
Rim         -0.06
فتح         -0.06
Positive Logits
&p          0.06
Chủ         0.06
ект         0.06
академ      0.06
výši        0.06
Adding      0.06
",@"        0.06
,o          0.06
біль        0.06
Tre         0.06
Activations Density 0.210%