INDEX
Explanations
adjectives describing a positive evaluation or opinion
positive adjectives expressing high quality or excellence
Negative Logits
etus            -0.67
ournal          -0.65
ilings          -0.65
ibliography     -0.65
ravings         -0.64
earance         -0.64
uthor           -0.63
idas            -0.62
uke             -0.61
pend            -0.61

Positive Logits
enough          1.02
ãĤ´             0.78
hubs            0.75
nell            0.75
aneously        0.72
nonetheless     0.66
XD              0.66
additions       0.65
surpr           0.65
Dahl            0.64
Activations Density 0.207%