Explanations
phrases indicating relative positions (such as "above" and "below")
Negative Logits
Token            Logit
cal              -0.71
Portail          -0.68
riwal            -0.66
er               -0.64
conv             -0.64
man              -0.63
ClickListener    -0.63
ril              -0.63
eral             -0.60
cal              -0.59
Positive Logits
Token            Logit
ABOVE            2.18
Above            2.17
Above            2.09
above            2.06
above            2.05
ABOVE            2.00
bove             1.79
BELOW            1.61
Below            1.51
below            1.48
Activations Density: 0.078%
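
The page does not say how the density figure is computed. A common convention, assumed here, is that density is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires at all. The source page does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 8 of which activate the feature.
sample = [0.0] * 9992 + [1.3, 0.4, 2.1, 0.9, 0.2, 3.0, 0.7, 1.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.078% would mean the feature fires on fewer than one token in a thousand, which fits a feature tied to a narrow set of words such as "above" and "below".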