Explanations
HTML table elements and attributes
Negative Logits
ajo       -0.18
ngo       -0.16
abbo      -0.15
ÎłÎ¿Î»    -0.15
iets      -0.15
-threat   -0.15
krv       -0.15
affen     -0.14
ÏĦιÏĥ     -0.14
Ģìŀ¥      -0.14
Positive Logits
mony         0.17
лев          0.16
Neutral      0.15
اÙĦÙĪ        0.15
sublicense   0.14
ANDING       0.14
ament        0.13
alian        0.13
neutral      0.13
wal          0.13
Activations Density 0.020%