Explanations
instances where text mentions an excessive amount or intensity of something
Negative Logits
byn          -0.86
saf          -0.83
selves       -0.82
Ń·           -0.81
stead        -0.79
İĭ           -0.77
ands         -0.77
iversary     -0.73
psons        -0.73
sylv         -0.73
Positive Logits
emphasis            0.92
inconsistency       0.86
attention           0.85
inventoryQuantity   0.82
trouble             0.82
effort              0.81
hassle              0.80
hype                0.79
temptation          0.78
dmg                 0.76
Activations Density 10.640%