INDEX
Explanations
phrases related to procedural instructions or guidelines
Negative Logits
להם  -0.56
,  -0.55
👈  -0.54
டும்  -0.52
po  -0.51
eléctricos  -0.50
–  -0.49
numerusform  -0.49
apie  -0.49
ซ์  -0.49
Positive Logits
itſelf  1.05
Anſ  0.94
myſelf  0.93
ſelves  0.91
دانشنامهٔ  0.90
Houſe  0.87
Majefty  0.85
ſelf  0.84
(token not rendered)  0.81
neſs  0.81
Activations Density 0.112%