INDEX
Explanations
phrases related to lack of effort or interest
instances of the word "bother" and its variations
Negative Logits
sung        -0.82
oiler       -0.78
arta        -0.75
arb         -0.74
ramer       -0.73
ophe        -0.73
UE          -0.70
anked       -0.69
reements    -0.68
uts         -0.68
Positive Logits
bother      1.09
bothering   1.07
crow        0.88
bothered    0.81
some        0.80
tamp        0.75
fulness     0.74
MENTS       0.74
bothers     0.66
naires      0.65
Activations Density 0.020%
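
As a rough sanity check on the second explanation ("instances of the word 'bother' and its variations"), the positive-logit tokens listed above can be scanned for the stem "bother". This is only an illustrative sketch: the dictionary is copied by hand from the listing, and the helper name `matching_tokens` is hypothetical, not part of any dashboard tooling.

```python
# Positive-logit tokens and values, copied by hand from the listing above.
positive_logits = {
    "bother": 1.09,
    "bothering": 1.07,
    "crow": 0.88,
    "bothered": 0.81,
    "some": 0.80,
    "tamp": 0.75,
    "fulness": 0.74,
    "MENTS": 0.74,
    "bothers": 0.66,
    "naires": 0.65,
}


def matching_tokens(logits, stem):
    """Return the tokens whose lowercase form contains the given stem.

    Hypothetical helper for illustration only.
    """
    return [tok for tok in logits if stem in tok.lower()]


bother_tokens = matching_tokens(positive_logits, "bother")
share = len(bother_tokens) / len(positive_logits)

# Prints: ['bother', 'bothering', 'bothered', 'bothers'] 40%
print(bother_tokens, f"{share:.0%}")
```

Four of the ten top positive tokens contain the stem, including the two highest-valued ones, which is consistent with that explanation; the remaining six tokens are not accounted for by this simple substring check.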