INDEX
Explanations
instances where the speaker is addressing or asking something of another person
Negative Logits
Byrne     -0.15
hazi      -0.15
Carbon    -0.15
carbon    -0.15
yla       -0.14
dera      -0.14
nze       -0.14
iset      -0.14
oston     -0.14
ining     -0.14
Positive Logits
çŃĴ       0.14
alic      0.14
Äįer      0.14
orie      0.14
oth       0.14
ortho     0.14
_male     0.14
Morr      0.13
.define   0.13
_USAGE    0.13
Activations Density: 0.018%