INDEX
Explanations
conversational elements related to acceptance and social dynamics
Negative Logits
rone          -0.15
Moor          -0.14
audi          -0.14
ubi           -0.13
ineligible    -0.13
IBE           -0.13
UMP           -0.13
-lnd          -0.13
ŀ             -0.13
913           -0.13
Positive Logits
setBackgroundColor    0.15
Jay                   0.14
eton                  0.14
rupa                  0.14
Cater                 0.14
lier                  0.13
Eug                   0.13
lig                   0.13
rek                   0.13
Conway                0.13
Activations Density: 0.205%