Explanations
phrases indicating storytelling or sharing information
phrases that emphasize direct communication or statements directed at someone
Negative Logits
hement      -0.59
©¶æ¥µ       -0.59
iw          -0.59
hift        -0.58
luster      -0.55
osp         -0.54
untarily    -0.54
recourse    -0.53
ibly        -0.53
calling     -0.53
Positive Logits
guys        1.24
somet       0.87
anecd       0.80
why         0.78
yourselves  0.77
're         0.75
firsthand   0.75
something   0.73
bluntly     0.71
dudes       0.70
Activations Density 0.029%