INDEX
Explanations
references to emotional experiences and familial relationships
Negative Logits
Outside    -0.15
Outside    -0.15
azzi       -0.14
ÑĤаки      -0.14
outside    -0.14
Previous   -0.13
_within    -0.13
åIJij      -0.13
ASA        -0.13
.bukkit    -0.13
Positive Logits
back       0.50
based      0.40
Back       0.38
Back       0.36
back       0.36
BACK       0.33
-back      0.32
based      0.32
Based      0.32
_back      0.32
Activations Density: 0.035%