INDEX
Explanations
adverbs that modify actions with a positive connotation
adverbs that describe manner or intensity
Negative Logits
enf          -0.68
eries        -0.62
ilater       -0.61
hao          -0.61
totally      -0.61
rite         -0.61
privilege    -0.60
outright     -0.60
willfully    -0.58
ado          -0.57
Positive Logits
enough       0.90
throughout   0.86
thereafter   0.84
enough       0.84
tics         0.77
utics        0.74
into         0.74
during       0.74
ILCS         0.73
afterwards   0.73
Activations Density: 0.159%