Explanations
instances of significant emotional or relational connection between characters
Negative Logits
Token     Logit
Rivera    -0.17
sur       -0.15
atic      -0.15
्रद       -0.15
pard      -0.15
region    -0.14
hood      -0.14
iner      -0.14
922       -0.14
gre       -0.14
Positive Logits
Token     Logit
ushima    0.15
inki      0.15
eren      0.15
บล        0.15
abr       0.14
omore     0.14
hell      0.14
okit      0.14
یزی       0.14
ンピ      0.14
Activations Density 0.002%