INDEX
Explanations
phrases related to opinions or beliefs
instances of the phrase "anyone who" followed by varying actions or characteristics
Negative Logits
tails         -0.72
Delicious     -0.68
urers         -0.67
ãĥĵ           -0.65
Recovery      -0.62
abi           -0.60
segregated    -0.59
Beasts        -0.59
anity         -0.58
Bake          -0.58
Positive Logits
knows         0.92
soever        0.85
perty         0.80
wants         0.78
pires         0.77
cares         0.76
reads         0.74
remembers     0.73
lvl           0.72
atever        0.69
Activations Density: 0.076%