Explanations
phrases that exhibit a sense of being unrestrained or unbounded
Negative Logits
token          logit
hops           -0.86
groups         -0.79
roups          -0.78
vertisement    -0.78
ovember        -0.76
rax            -0.75
sters          -0.74
akov           -0.72
apter          -0.71
ington         -0.70
Positive Logits
token          logit
admiration     1.04
vigilance      1.03
devotion       1.00
honesty        0.99
optimism       0.97
loyalty        0.94
reverence      0.92
gratitude      0.91
sunshine       0.90
composure      0.88
Activations Density: 0.060%