INDEX
Explanations
words related to predictions and assumptions
Negative Logits
stood               -0.68
dylib               -0.66
Monitor             -0.66
ebted               -0.65
ologue              -0.63
Reporting           -0.61
Gallery             -0.60
Ow                  -0.60
watches             -0.60
Tables              -0.59
Positive Logits
counterproductive   1.14
fraught             1.12
preferable          1.07
frowned             1.02
futile              1.02
taboo               1.01
advisable           1.00
paramount           0.97
pointless           0.97
folly               0.96
Activations Density 1.543%