INDEX
Explanations
phrases that indicate enabling or allowing actions
Negative Logits
<?               -0.54
wantErr          -0.47
initComponents   -0.47
informée         -0.46
ostante          -0.44
disambiguazione  -0.44
发表于            -0.43
#+#              -0.43
تكبرها           -0.42
provoquer        -0.42
Positive Logits
easily       1.00
easily       0.90
facilement   0.85
facilmente   0.77
freely       0.75
fácilmente   0.73
Easily       0.69
safely       0.69
Easily       0.68
confidently  0.67
Activations Density 0.445%
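
As a rough illustration of what a density figure like this usually means, the sketch below computes the fraction of token positions on which a feature fires. It assumes density is defined as "share of positions with activation above zero", which the page itself does not state; the function name `activation_density`, the threshold, and the sample data are all hypothetical and chosen only so that the result matches the 0.445% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 20,000 token positions, 89 of them active.
sample = [0.0] * 19911 + [1.0] * 89
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low indicates a sparse feature: it fires on well under 1% of positions, which fits a feature tied to a narrow family of tokens such as "easily", "freely", and "safely".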