INDEX
Explanations
hesitation and filler words
Negative Logits
ißler        0.39
петров       0.38
gór          0.37
üedad        0.37
namespaces   0.37
जीटी         0.37
Kerk         0.37
чай          0.36
കേ           0.36
OLO          0.36
Positive Logits
yeah         0.71
huh          0.68
…            0.64
hh           0.62
...          0.59
oh           0.58
hmm          0.58
..           0.56
hm           0.55
hmm          0.54
Activations Density: 0.008%