Explanations
references to singular and plural forms of the word "one" or its variations
Negative Logits
Token     Logit
land      -0.20
one       -0.18
thane     -0.17
cape      -0.17
ron       -0.16
walker    -0.16
一下      -0.16
چه        -0.16
ها        -0.16
th        -0.16
Positive Logits
Token          Logit
-sided         0.27
-third         0.26
ida            0.26
-eyed          0.26
-dimensional   0.26
-handed        0.23
onta           0.23
idas           0.23
-bedroom       0.22
-way           0.21
Activations Density 0.067%
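
For orientation, the density figure can be read as the share of token positions on which the feature fires. The page does not state how it is computed, so the sketch below rests on an assumption: density is the fraction of positions whose activation exceeds zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active. The dashboard itself does not document its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 3,000 token positions, 2 of which activate the feature.
sample = [0.0] * 2998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.067%"
```

At this scale, a density of 0.067% corresponds to roughly two active positions in every 3,000 tokens.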