Explanations
modals and expressions of ability or capacity
Negative Logits
vouloir        -0.70
defaultstate   -0.58
wanting        -0.58
Want           -0.58
needing        -0.56
disfraz        -0.54
gillar         -0.53
såsom          -0.53
ganas          -0.53
preferring     -0.53
Positive Logits
easily          0.99
freely          0.86
comfortably     0.79
afford          0.77
easily          0.73
effectively     0.73
confidently     0.72
safely          0.71
properly        0.69
truly           0.69
Activations Density 0.400%
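
As a reading aid for the density figure, here is a minimal sketch of how such a number could be computed, assuming that "activations density" means the fraction of sampled tokens on which the feature fires above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the toy activation values are all hypothetical illustrations and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (number of sampled tokens).
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy sample: 1000 tokens, of which only 4 activate the feature.
toy_activations = [0.0] * 996 + [1.2, 0.3, 2.5, 0.7]

density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.400%
```

Under this assumed definition, a density of 0.400% would mean the feature is active on roughly 4 of every 1000 sampled tokens.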