INDEX
Explanations
references to interactions and relationships among individuals
Negative Logits
laughter    -0.15
оном        -0.15
λον         -0.14
Yesterday   -0.14
adu         -0.14
}elseif     -0.14
uron        -0.14
ehr         -0.13
/sn         -0.13
iento       -0.13
Positive Logits
then        0.21
then        0.20
soon        0.17
then        0.17
Then        0.16
THEN        0.15
ippers      0.15
SOLE        0.15
wart        0.15
entonces    0.15
Activations Density 0.222%