INDEX
Explanations
information related to politics, global events, and ethical/moral issues
Negative Logits
inver     -0.77
ilet      -0.69
dry       -0.68
itch      -0.64
waters    -0.63
cise      -0.62
itches    -0.61
flix      -0.58
vernight  -0.56
spoilers  -0.56
Positive Logits
incurred    1.06
accrued     1.06
bestowed    1.04
afforded    1.01
emanating   0.94
wrought     0.91
arising     0.89
undertaken  0.89
generated   0.88
inflicted   0.87
Activations Density: 2.760%