INDEX
Explanations
dialogue and actions related to characters' emotional experiences
Negative Logits
Carson      -0.96
espie       -0.93
bolted      -0.88
helicop     -0.85
Webb        -0.84
loaf        -0.83
ranc        -0.81
Colorado    -0.81
Lyons       -0.80
Nebraska    -0.78
Positive Logits
manga       1.52
anime       1.45
ãĢĮ         1.44
Anime       1.41
Yamato      1.38
ãĢİ         1.38
ãĢIJ         1.35
æ           1.35
âĺĨ         1.34
ãĥ          1.33
Activations Density 0.474%