INDEX
Explanations
verbs or phrases denoting amounts or quantities
phrases that describe actions or consequences that "amount to" something significant
Negative Logits
arts         -0.81
utherland    -0.76
rers         -0.75
rot          -0.74
atu          -0.74
adjusts      -0.72
rose         -0.72
notations    -0.71
rients       -0.70
soon         -0.70
Positive Logits
heresy          1.09
blasphemy       1.00
treason         0.98
betrayal        0.98
manslaughter    0.97
genocide        0.94
extortion       0.92
bribery         0.91
blackmail       0.89
vandalism       0.87
Activations Density: 0.148%
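
As a rough illustration of how to read the density figure, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. The source does not confirm that definition. Under this assumption, 0.148% corresponds to roughly 148 active positions per 100,000 tokens. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard's exact definition is not stated.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 15 of which are active.
sample = [0.0] * 10_000
for i in range(15):
    sample[i * 100] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")
```

A feature this sparse is silent on nearly all tokens, which is consistent with the narrow theme of the top positive logits listed above.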