INDEX
Explanations
the presence of the word "pros" or related terms indicating advantages
Negative Logits
een         -0.17
endon       -0.17
Occurrences -0.16
ëĭĪìĬ¤      -0.15
istle       -0.15
tainment    -0.15
utschen     -0.15
eed         -0.14
vert        -0.14
ntax        -0.14
Positive Logits
pective     0.32
pects       0.29
pector      0.29
pros        0.23
ely         0.22
cons        0.21
PECT        0.20
pecting     0.20
ody         0.19
ively       0.19
Activations Density: 0.009%