INDEX
Explanations
verbs related to emphasizing or highlighting
terms related to affirmation and emphasis
Negative Logits
Token         Logit
inaction      -0.73
ACTION        -0.69
folly         -0.66
indifference  -0.65
captive       -0.64
Reviewer      -0.63
amusement     -0.62
sterling      -0.61
distraction   -0.61
rationality   -0.60
Positive Logits
Token         Logit
istered       1.16
ited          1.13
ighed         1.12
uates         1.11
pling         1.10
uld           1.06
izes          1.05
tained        1.05
pped          1.05
pled          1.05
Activations Density 0.184%
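
The density figure above is stated without a definition. A common reading, assumed here and not confirmed by this page, is the fraction of token positions on which the feature has a nonzero activation. The sketch below illustrates that reading; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, and the sample is synthetic, built only so that the arithmetic lands on the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The page itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example (not real data): 100,000 token positions,
# 184 of which carry a nonzero activation.
sample = [0.0] * 99_816 + [1.5] * 184
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.184% means the feature is active on roughly 2 of every 1,000 tokens, which is typical of a sparse feature.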