Explanations
sentences or phrases contrasting different aspects or options
Negative Logits
hire          -0.67
ª             -0.65
clud          -0.61
thouse        -0.60
endor         -0.59
aii           -0.58
vest          -0.57
okia          -0.56
orc           -0.55
arag          -0.55
Positive Logits
depending     1.76
Either        1.47
respectively  1.41
whichever     1.38
depending     1.38
Both          1.26
Either        1.20
Both          1.14
ichever       1.08
Whatever      1.07
Activations Density: 0.902%