INDEX
Explanations
differences and variations between subjects or items being discussed
Negative Logits
Token    Logit
fix      -0.44
Bande    -0.42
Esp      -0.40
Esp      -0.40
nloa     -0.38
Lur      -0.37
ession   -0.37
badger   -0.37
jovens   -0.37
pass     -0.37
Positive Logits
Token         Logit
differences   1.14
Differences   1.10
Differences   1.02
AnchorStyles  0.98
DIFFER        0.95
Differ        0.93
مشين          0.93
differences   0.93
difference    0.93
bedaan        0.93
Activations Density: 0.707%