INDEX
Explanations
key concepts and relationships involving personal identity and social roles
Negative Logits
Either    -0.25
either    -0.25
Either    -0.24
either    -0.22
EITHER    -0.20
éĶĭ       -0.17
PFN       -0.16
isoft     -0.15
íĮĮ       -0.15
alendar   -0.15
Positive Logits
nor       0.81
nor       0.59
Nor       0.56
NOR       0.54
Nor       0.49
ni        0.38
noch      0.33
nors      0.33
||        0.31
neither   0.29
Activations Density 0.017%
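
Two entries in the negative logits list (éĶĭ and íĮĮ) look like the printable stand-ins that byte-level BPE vocabularies use for raw UTF-8 bytes. The page does not name the tokenizer, so the following is only a sketch under the assumption that the tokens come from a GPT-2 style byte-level vocabulary. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not taken from this page.

```python
def bytes_to_unicode():
    """Build the byte -> printable character table of GPT-2 style byte-level BPE.

    Assumption: the dashboard's tokenizer uses this scheme; the page does not say so.
    """
    # Bytes that are already printable keep their own character.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to U+0100 and above, in ascending byte order.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token back to the text it encodes.

    Raises KeyError for characters outside the byte-level alphabet.
    Byte sequences that are not valid UTF-8 decode to U+FFFD.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["éĶĭ", "íĮĮ", "Either", "||"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption the two tokens resolve to the single code points U+950B and U+D30C, while plain ASCII tokens such as Either and || are unchanged. A leading Ġ character, if present in the raw vocabulary, would decode to a space, which may be why several tokens appear twice in the lists with different values.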