INDEX
Explanations
phrases related to comparison and equivalence
Negative Logits
inya     -0.15
itra     -0.15
ering    -0.15
swer     -0.15
UFF      -0.15
afone    -0.14
appa     -0.14
antee    -0.14
uff      -0.14
ビー     -0.14
Positive Logits
maybe        0.44
maybe        0.40
甚至         0.35
possibly     0.32
Maybe        0.32
Maybe        0.31
乃           0.31
perhaps      0.30
indeed       0.28
vielleicht   0.28
Activations Density 0.157%