Explanations
words related to quitting a job
Negative Logits
inen          -0.82
arov          -0.76
Catalog       -0.73
translation   -0.70
Stretch       -0.67
feat          -0.67
Redditor      -0.67
yah           -0.65
ANS           -0.65
guid          -0.62
Positive Logits
smoking       1.24
Smoking       1.02
abruptly      0.92
smoking       0.91
ting          0.88
ters          0.84
quitting      0.84
altogether    0.83
voluntarily   0.79
Quit          0.78
Activations Density 0.021%