INDEX
Explanations
words related to speculation or likelihood
Negative Logits
estern       -0.94
ilts         -0.74
ests         -0.72
atform       -0.71
restling     -0.67
ioch         -0.66
ainers       -0.65
leasing      -0.63
iership      -0.62
oning        -0.61
Positive Logits
innocuous     0.93
awfully       0.92
oddly         0.92
strangely     0.88
poised        0.83
rils          0.81
like          0.81
unstoppable   0.80
unlikely      0.77
suspicious    0.76
Activations Density: 0.052%