INDEX
Explanations
phrases that indicate universality or all-encompassing concepts
Negative Logits
roz      -0.17
iteit    -0.16
Daly     -0.16
Misc     -0.15
inn      -0.15
Murdoch  -0.15
Birch    -0.15
gravity  -0.15
apia     -0.14
alc      -0.14
Positive Logits
وقت      0.18
时       0.17
времени  0.17
lúc      0.17
times    0.16
เส       0.16
кта      0.16
时候     0.16
wick     0.16
thời     0.16
Activations Density 0.029%