INDEX
Explanations
phrases starting with "Those"
repetitions of the word "Those."
Negative Logits
ILY       -0.80
kamp      -0.79
efully    -0.77
osate     -0.77
PLA       -0.77
HY        -0.75
abis      -0.75
ob        -0.74
enegger   -0.74
¨         -0.69
Positive Logits
pesky          1.07
wishing        1.00
who            0.94
kinds          0.93
sentiments     0.86
wanting        0.84
statements     0.84
thoughts       0.83
feelings       0.82
distinctions   0.81
Activations Density 0.071%