INDEX
Explanations
dialogue and conversational exchanges
Text after punctuation/parenthesis
jokes, laughter, or humor
Negative Logits
Chham         -0.36
IContainer    -0.34
habad         -0.33
sifs          -0.32
Vidite        -0.32
Grit          -0.31
доступ        -0.31
Divers        -0.30
meiras        -0.30
OnInit        -0.30
Positive Logits
الحره             0.59
plegable          0.56
laughter          0.54
SharedCtor        0.52
jouets            0.51
RegressionTest    0.51
joking            0.50
yürü              0.49
joke              0.49
laughing          0.49
Activations Density 0.545%
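
The page does not say how the density figure is computed. One common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample activations are all hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up data: 20,000 token positions, 109 of them with a nonzero activation.
sample = [0.0] * 19891 + [1.3] * 109
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.545% would mean the feature fires on roughly 1 token in 183.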