INDEX
Explanations
phrases related to controversial or sensitive topics
statements about significant conditions or trends
Negative Logits
Awakens       -0.76
lashes        -0.71
LV            -0.69
Auto          -0.65
Pats          -0.64
retrie        -0.63
warranties    -0.62
attaches      -0.61
watches       -0.61
Norn          -0.61
Positive Logits
waning        1.05
manifold      1.02
negligible    0.98
dwindling     0.98
immense       0.97
widespread    0.96
undeniable    0.96
enormous      0.96
largely       0.94
unparalleled  0.93
Activations Density: 0.217%