INDEX
Explanations
adverbs and their variations that describe the manner of actions
Negative Logits
erto      -0.77
icip      -0.76
oresc     -0.74
ordan     -0.70
ulton     -0.65
rieving   -0.64
raltar    -0.64
HAEL      -0.63
iliate    -0.63
kowski    -0.63
Positive Logits
expensive    0.80
inaccurate   0.77
close        0.76
awful        0.75
ugly         0.74
pes          0.73
sized        0.73
inept        0.72
accurate     0.72
cheap        0.72
Activations Density 0.041%