INDEX
Explanations
adjectives and descriptive phrases that convey positivity or quality
Negative Logits
disastrous        -0.54
ksis              -0.52
<bos>             -0.52
unsatisfactory    -0.51
Awful             -0.47
feeble            -0.47
perilous          -0.45
Unavailable       -0.44
ánica             -0.43
wretched          -0.43
Positive Logits
addGap            0.72
clean             0.70
êques             0.69
istoitu           0.67
AssemblyTitle     0.65
wapV              0.64
SharedDtor        0.64
Wikimedijinoj     0.64
ngdoc             0.62
oredCriteria      0.62
Activations Density 0.115%
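
The density figure can be read as the share of sampled token positions on which the feature fires. The sketch below assumes that definition (activation strictly greater than zero counts as firing); the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2,000 token positions, with the feature active on 2.
sample = [0.0] * 1998 + [3.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.100%"
```

Under this reading, a density of 0.115% would correspond to the feature being active on roughly 1 in every 870 sampled tokens.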