INDEX
Explanations
expressions of positivity or admiration
Negative Logits
[missing]         -0.95
okuyayım          -0.78
ModelExpression   -0.77
argout            -0.75
InjectMocks       -0.74
endpush           -0.73
AntiForgeryToken  -0.73
Deum              -0.73
Xf                -0.72
IntentFilter      -0.72
POSITIVE LOGITS
pretty   1.42
Pretty   1.31
Pretty   1.28
pretty   1.27
darn     1.16
PRET     1.00
Fairly   0.96
fairly   0.94
PRET     0.92
retty    0.87
Activations Density 0.046%