Explanations
statements expressing personal opinions or beliefs
Negative Logits
liable          -0.15
.restaurant     -0.15
_OW             -0.15
avian           -0.14
جÙĩ             -0.14
(Collision      -0.14
tell            -0.13
\Controllers    -0.13
UNCH            -0.13
rometer         -0.13
Positive Logits
respect         0.21
perfectly       0.20
respect         0.17
respects        0.17
appings         0.16
å°Ĭ             0.15
fully           0.15
respecting      0.15
totally         0.15
boro            0.15
Activations Density 0.098%