INDEX
Explanations
phrases indicating an evaluation of the balance between different factors
comparisons of advantages and disadvantages
Negative Logits
flix    -0.69
kan     -0.69
oker    -0.67
nn      -0.64
hol     -0.63
Zah     -0.62
olt     -0.62
bj      -0.61
iard    -0.61
cu      -0.61
Positive Logits
ours             0.79
expectations     0.79
comprehension    0.73
usual            0.71
feelings         0.70
productivity     0.70
actual           0.68
theirs           0.67
seriousness      0.66
profits          0.65
Activations Density 0.284%