INDEX
Explanations
instances where the concept of loyalty or faithfulness is mentioned
phrases expressing allegiance or loyalty
Negative Logits
Token             Logit
llo               -0.80
TPPStreamerBot    -0.70
employed          -0.69
going             -0.68
ersed             -0.68
rio               -0.67
waived            -0.65
headed            -0.65
sky               -0.64
elig              -0.64
Positive Logits
Token             Logit
wered             0.92
coincide          0.89
appease           0.86
ensure            0.84
conserve          0.83
ggles             0.82
signify           0.82
pless             0.79
days              0.79
minimize          0.78
Activations Density 0.262%