INDEX
Explanations
phrases indicating actions or decisions to be taken
various forms of verbs and their implications in different contexts
Negative Logits
Malays      -0.62
Malaysia    -0.57
Drag        -0.52
atl         -0.51
©¶æ         -0.50
ships       -0.49
Shar        -0.49
Bild        -0.48
Lars        -0.48
KL          -0.48
Positive Logits
temptation          0.67
indistinguishable   0.59
endif               0.55
âĵĺ                 0.54
elig                0.54
morrow              0.54
'."                 0.53
.'"                 0.53
goto                0.52
etheless            0.52
Activations Density: 1.284%