Negative Logits
Token          Logit
Null           -0.07
Sand           -0.07
Constraints    -0.06
_wind          -0.06
ACITY          -0.06
Wand           -0.06
asad           -0.06
(stdout        -0.06
Ernst          -0.06
Naughty        -0.06
Positive Logits
Token          Logit
appreciate     0.12
appreciated    0.11
appreciation   0.11
apprec         0.08
repair         0.08
agre           0.07
respecting     0.07
↵ ↵            0.07
Apprec         0.07
。↵            0.07
Activations Density: 0.012%