INDEX
Explanations
phrases indicating a comparison or dual aspect
the conjunction 'and' used in various contexts
Negative Logits
uel     -0.76
Ͻ       -0.74
eks     -0.73
ÃĽ      -0.71
IJ      -0.71
rance   -0.71
³       -0.70
ivid    -0.70
oor     -0.68
iper    -0.66
Positive Logits
ours            0.80
nam             0.71
hers            0.70
sexes           0.67
chard           0.66
autobiography   0.65
theirs          0.65
CPI             0.63
nons            0.62
imperson        0.61
Activations Density: 0.169%