INDEX
Explanations
language indicating specificity and detail
phrases related to the act of specifying or detailing information
Negative Logits
Token       Logit
Think       -0.76
tumblr      -0.70
assic       -0.68
imum        -0.65
kt          -0.64
Throw       -0.64
nee         -0.61
irt         -0.60
bees        -0.59
aturated    -0.59
Positive Logits
Token          Logit
specifics      1.28
nor            0.95
details        0.93
particulars    0.90
formally       0.84
specific       0.83
definitively   0.82
exact          0.81
anymore        0.81
publicly       0.80
Activations Density: 0.226%