Explanations
expressions of desire or intention
want or need
Negative Logits
Token           Logit
modelBuilder    -0.68
k               -0.66
woordig         -0.61
Pons            -0.61
FACT            -0.60
Persons         -0.60
Slf             -0.60
ac              -0.59
Raton           -0.58
ַת               -0.58
Positive Logits
Token           Logit
hoped           0.94
originalmente   0.93
twas            0.88
WANTED          0.87
―――――           0.84
autrefois       0.84
'/')            0.80
던              0.79
€)              0.79
wished          0.79
Activations Density: 0.038%