Explanations
verbs describing being or existence
Negative Logits
Token       Logit
rones       -0.77
cease       -0.67
compose     -0.66
culosis     -0.63
ĸļ          -0.63
promise     -0.62
roam        -0.61
Rab         -0.61
occupy      -0.61
mark        -0.61

Positive Logits
Token       Logit
viewed      1.14
considered  1.09
construed   1.09
regarded    1.06
traced      1.02
likened     1.01
forgiven    1.01
seen        1.00
able        0.98
fitting     0.97

Activations Density 0.999%