INDEX
Explanations
expressions of personal opinions and feelings about experiences
Negative Logits
امین	-0.15
igen	-0.14
jom	-0.14
族自治	-0.14
ť	-0.14
ǎ	-0.14
ovy	-0.14
ErrorException	-0.13
ood	-0.13
lox	-0.13
Positive Logits
more	0.38
more	0.28
enough	0.27
less	0.26
MORE	0.25
mehr	0.25
больше	0.23
most	0.23
más	0.23
більше	0.22
Activations Density 0.123%
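
Token strings in logit tables like these are often stored in a byte-level BPE display form, where each raw byte is shown as one printable character, so non-ASCII tokens can look scrambled. The sketch below shows one way to map such a string back to readable text. It assumes the tokenizer uses the GPT-2 style byte-to-character table, which this page does not confirm; `bytes_to_unicode` and `decode_token` are illustrative names, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build a GPT-2 style table mapping each byte (0-255) to one printable character.

    Assumption: the tokenizer behind the dashboard uses this scheme.
    """
    # Bytes that are already printable keep their own code point.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(0xA1, 0xAD))
          + list(range(0xAE, 0x100)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: display character -> raw byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a byte-level display string back into UTF-8 text.

    Characters outside the table are assumed to be already-decoded text
    and are passed through as their own UTF-8 bytes.
    """
    raw = bytearray()
    for ch in display:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


print(decode_token("Ġmore"))      # leading "Ġ" stands for a space byte
print(decode_token("æĹıèĩªæ²»"))  # three-byte CJK characters
```

Apply this only to strings still in display form. Text that is already readable (for example a correctly rendered "más") would be reinterpreted byte by byte and come out damaged.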