Explanations
phrases indicating accuracy or precision
instances of the word "accurately."
Negative Logits
token        logit
necessity    -0.73
favorites    -0.72
Redemption   -0.72
tropes       -0.72
inaction     -0.71
reaction     -0.71
willingness  -0.70
approvals    -0.69
taboo        -0.69
hegemony     -0.69
Positive Logits
token        logit
imated       0.95
ãĤ©          0.93
compensated  0.85
modeled      0.83
reproduced   0.82
correct      0.79
paced        0.78
etitive      0.77
represented  0.77
addressed    0.77
Activations Density: 0.021%
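
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" is the share of token positions where the
    feature fires at all. This definition is not confirmed by the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 21 of 100,000 token positions.
sample = [1.5] * 21 + [0.0] * 99_979
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.021% corresponds to roughly 21 active positions per 100,000 tokens, which would make this a sparse feature.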