Explanations
instances of the word "ast," indicating a focus on negative or harsh events
Negative Logits
Token           Logit
BOOK            -0.67
Ib              -0.63
BY              -0.59
Sochi           -0.58
Forth           -0.57
Finder          -0.56
CTR             -0.56
pheus           -0.55
largeDownload   -0.55
BILITIES        -0.55
Positive Logits
Token           Logit
eful             1.08
ening            0.92
liest            0.91
ened             0.91
liness           0.90
ener             0.86
eland            0.84
iest             0.82
eners            0.81
est              0.80
Activations Density 0.009%
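
An activation density such as the one above is commonly read as the share of sampled tokens on which the feature fires at all. The sketch below assumes that reading. The function name `activation_density`, the zero threshold, and the sample values are illustrative assumptions and are not taken from the dashboard itself.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the feature is active, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    firing = sum(1 for a in activations if a > threshold)
    return 100.0 * firing / len(activations)


# Hypothetical sample: the feature fires on 9 of 100,000 tokens.
sample = [1.5] * 9 + [0.0] * 99_991
density = activation_density(sample)
print(f"{density:.3f}%")  # prints "0.009%"
```

Under this reading, a density of 0.009% corresponds to roughly 9 active tokens per 100,000 sampled, which marks the feature as very sparse.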