Explanations
phrases related to ambiguity and contrasting viewpoints in discussions
Negative Logits
Token      Logit
çͱäºİ     -0.16
omanip     -0.16
Due        -0.15
majority   -0.14
wort       -0.14
umhur      -0.13
utom       -0.13
Äĥn        -0.13
ioneer     -0.13
-Clause    -0.13
Positive Logits
Token      Logit
few        0.24
taken      0.23
such       0.20
few        0.20
Taken      0.19
Taken      0.19
viewed     0.18
Few        0.18
Few        0.17
context    0.17
Activations Density: 0.421%