INDEX
Explanations
phrases related to quality assessment or approval
expressions of unfavorable evaluations or criticisms
Negative Logits
beit          -0.68
irresist      -0.65
Neuroscience  -0.63
lessness      -0.62
iland         -0.62
ription       -0.60
generously    -0.60
Heath         -0.59
onomy         -0.58
rather        -0.58
Positive Logits
anymore       1.43
nor           1.08
ãĤ®           0.83
Ú             0.82
ÙIJ           0.81
yet           0.74
yet           0.72
either        0.70
vier          0.69
unless        0.69
Activations Density 0.362%