Explanations
phrases related to reputation, especially negative connotations
mentions of "reputation"
Negative Logits
Token      Logit
cise       -0.87
pheus      -0.77
wed        -0.75
tein       -0.72
nes        -0.72
stad       -0.70
vention    -0.68
phone      -0.67
err        -0.64
apor       -0.64
Positive Logits
Token              Logit
tarn               0.96
reputation         0.95
bearer             0.81
worthiness         0.77
tremend            0.77
forged             0.75
ously              0.73
internationally    0.73
abroad             0.72
arily              0.71
Activations Density 0.080%