Explanations
references to conditions and evaluations that claim increased efficacy or success
Negative Logits
invece          -0.76
either          -0.73
Either          -0.73
either          -0.71
Either          -0.71
simplesmente    -0.65
então           -0.64
inoltre         -0.64
prostu          -0.62
simply          -0.61
Positive Logits
technically      1.28
nominally        1.13
admittedly       1.07
ostensibly       1.02
itſelf           0.97
misschien        0.96
undoubtedly      0.95
theoretically    0.93
certes           0.91
may              0.91
Activations Density 0.556%