INDEX
Explanations
phrases related to expressing dissatisfaction and disagreement with others
repetitive phrases or clauses that express ongoing actions or sentiments
Negative Logits
Token        Logit
uto          -0.67
¬¼           -0.60
ophon        -0.58
ague         -0.58
irie         -0.56
atari        -0.56
iculty       -0.56
ase          -0.55
Forge        -0.55
DevOnline    -0.55
Positive Logits
Token        Logit
huh          1.25
..."         1.13
eh           1.13
sir          1.07
]"           1.01
namely       0.95
haha         0.94
\"           0.93
[/           0.91
,''          0.90
Activations Density: 0.301%