INDEX
Explanations
phrases indicating reflective thinking or introspection
the repeated use of the pronoun "it" in various contexts
Negative Logits
Angola    -0.63
Friendly  -0.62
Fine      -0.61
dding     -0.60
idon      -0.59
mobile    -0.59
Band      -0.59
ority     -0.57
ggies     -0.57
cox       -0.56
Positive Logits
alian       0.94
chy         0.87
transpired  0.86
asca        0.84
pains       0.83
beh         0.82
unes        0.82
rains       0.79
happened    0.78
boils       0.77
Activations Density: 0.123%