INDEX
Explanations
phrases related to consequences and risks of actions or decisions
Negative Logits
umbs      -0.15
vej       -0.15
Lum       -0.15
urance    -0.14
lum       -0.14
ullo      -0.14
Setter    -0.14
éĿ        -0.13
anas      -0.13
iams      -0.13
Positive Logits
rebound    0.26
rebounds   0.26
back       0.25
backlash   0.23
bite       0.22
bo         0.21
-back      0.19
BACK       0.19
biting     0.19
Back       0.19
Activations Density 0.168%