Explanations
names of characters and their relationships in dialogues
Negative Logits
autorytatywna      -0.90
aarrggbb           -0.71
שוליים             -0.68
ddots              -0.62
ligiloj            -0.60
ituary             -0.59
httphttps          -0.58
Geplaatst          -0.58
ChildScrollView    -0.57
Paglinawan         -0.57
Positive Logits
kissed             0.59
tagext             0.58
mumbled            0.54
fainted            0.53
grinned            0.51
jakiś              0.51
leaned             0.50
thâu               0.50
crouched           0.50
protested          0.49
Activations Density: 0.144%