Explanations
phrases emphasizing exclusivity or limitation
Negative Logits
Token      Logit
帖最后由    -0.53
__':       -0.52
Diwedd     -0.50
あえず      -0.46
esModule   -0.46
illion     -0.45
__':       -0.45
profess    -0.42
Heron      -0.42
Qua        -0.41
Positive Logits
Token      Logit
Pourtant   0.51
mere       0.45
pourtant   0.44
slechts    0.44
endnu      0.43
仅仅        0.41
ändå       0.40
yet        0.40
enfans     0.40
ännu       0.40
Activations Density: 0.396%
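
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; the definition (activation strictly above a threshold of zero), the function name `activation_density`, and the sample values are all assumptions for illustration, not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations (illustrative, not dashboard data).
sample = [0.0, 0.0, 1.7, 0.0, 0.3, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")  # prints 25.000%
```

Under this reading, a density of 0.396% would mean the feature is active on roughly 4 of every 1,000 tokens in the sampled text.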