INDEX
Explanations
phrases expressing a willingness to provide assistance or further details
Negative Logits
juges  -0.42
utnik  -0.40
zzar  -0.40
armées  -0.40
knap  -0.40
пен  -0.40
Menge  -0.38
ME  -0.38
loyment  -0.37
yorsunuz  -0.37
Positive Logits
whatever  0.81
gerne  0.81
=$?  0.76
whatever  0.76
gärna  0.76
Willing  0.75
whichever  0.73
willing  0.72
ويكيميديا  0.72
anything  0.71
Activations Density: 0.222%