INDEX
Explanations
terms related to levels of intensity or significance
adverbs that describe the manner of actions or conditions
Negative Logits
cade        -0.72
arya        -0.71
eto         -0.67
Majority    -0.67
retty       -0.67
Voy         -0.64
uese        -0.64
Timbers     -0.64
hyde        -0.63
Payne       -0.63
Positive Logits
spaced         0.90
protected      0.77
constructed    0.75
educated       0.73
insignificant  0.73
vaccinated     0.71
overlapping    0.70
defined        0.70
separated      0.70
placed         0.70
Activations Density 0.166%
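
The density figure above is presumably the share of sampled tokens on which this feature fires. The page does not say how it is computed, so the following is only a minimal sketch under that assumption: the function name, the zero threshold, and the sample data are all hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a non-zero
    (above-threshold) activation. The source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 1000 tokens, of which 2 activate the feature.
sample = [0.0] * 998 + [1.3, 0.4]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.166% would mean the feature is active on roughly 1 token in 600.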