INDEX
Explanations
emotional states and reactions related to interpersonal relationships
Negative Logits
ultan       -0.16
upo         -0.16
onor        -0.15
mino        -0.15
ardu        -0.14
quette      -0.14
بواسطة      -0.14
ibaba       -0.14
aginator    -0.13
atha        -0.13
Positive Logits
expressed   0.24
express     0.23
misplaced   0.23
shared      0.20
Express     0.19
express     0.18
stronger    0.18
Shared      0.18
border      0.17
shared      0.17
Activations Density 0.293%