INDEX
Explanations
instances of significant actions or conditions related to change and development
Negative Logits
this      -0.18
this      -0.17
uru       -0.15
éĤ£æł·    -0.15
116       -0.14
these     -0.14
ham       -0.14
Weiner    -0.14
there     -0.14
those     -0.14
Positive Logits
ĶåĽŀ      0.19
гов       0.16
ει        0.15
happens   0.15
ά         0.15
-Star     0.15
ÑĤеÑĢн    0.15
happened  0.15
PKG       0.15
ainer     0.14
Activations Density 0.406%