Explanations
Korean dramas
The neuron fires on tokens that are part of TV or movie titles—especially the content words of Korean drama titles.
Negative Logits
Token        Logit
страны       -0.06
Treasurer    -0.06
住           -0.06
�            -0.06
rail         -0.06
Sass         -0.06
.mod         -0.05
XS           -0.05
gf           -0.05
avail        -0.05
Positive Logits
Token        Logit
-results     0.07
>()↵↵        0.07
iko          0.07
۱۹۴          0.07
zvý          0.06
ไทย          0.06
_()↵         0.06
Bene         0.06
исключ       0.06
منابع        0.06
Activations Density 0.000%