INDEX
Explanations
phrases indicating strong emotions or reactions
intensifiers or expressions emphasizing a strong sentiment
Negative Logits
nings        -0.67
Flavoring    -0.63
looms        -0.61
amac         -0.60
coincides    -0.60
ulia         -0.59
eviction     -0.59
glances      -0.59
女           -0.58
works        -0.57
Positive Logits
bered               1.16
ooo                 1.07
oths                1.04
oooo                1.02
oooooooo            0.96
zin                 0.96
othes               0.93
othe                0.91
oooooooooooooooo    0.89
much                0.85
Activations Density 0.063%
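
The density figure can be read as the share of tokens on which the feature is active. The sketch below is one plausible way to compute such a number, not necessarily how this dashboard does it: the function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration. The sample was chosen so that 63 of 100,000 tokens are active, which reproduces a value of 0.063%.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 100,000 tokens, of which 63 have a nonzero activation.
sample = [0.0] * 99_937 + [1.5] * 63
print(f"{activation_density(sample):.3f}%")
```

A density this low means the feature is sparse: it fires on roughly one token in every 1,600.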