INDEX
Explanations
phrases indicating significant impacts or consequences, particularly in a negative context
Negative Logits
Token       Logit
лиш         -0.16
prest       -0.16
_Impl       -0.15
ради        -0.14
805         -0.13
695         -0.13
ReadOnly    -0.13
isher       -0.13
oria        -0.13
.svg        -0.13
Positive Logits
Token       Logit
bearing     0.23
knock       0.22
effect      0.22
Bearing     0.21
bearings    0.20
influence   0.20
adverse     0.18
butterfly   0.17
bearing     0.17
rippling    0.17
Activations Density 0.053%