Explanations
informal speech and interjections
Negative Logits
Perhaps     0.61
seldom      0.56
或许        0.55
Perhaps     0.54
perhaps     0.49
અન્ય        0.49
하여        0.49
automó      0.48
perhaps     0.48
หาก         0.48
Positive Logits
kinda       0.76
really      0.73
guy         0.70
really      0.69
vraiment    0.60
basically   0.58
dude        0.58
ずっと      0.57
なんか      0.56
yeah        0.55
Activations Density 0.009%