INDEX
Explanations
phrases indicating social or relational dynamics
Negative Logits
AGO               -0.17
automáticamente   -0.16
lou               -0.15
ÑĢож              -0.15
ILER              -0.14
odore             -0.14
{}.               -0.14
.mods             -0.14
/Foundation       -0.14
automatic         -0.13
Positive Logits
stype       0.16
ham         0.15
tradition   0.15
flush       0.14
utsch       0.14
bd          0.14
hof         0.14
BD          0.13
Cop         0.13
iang        0.13
Activations Density 0.065%