INDEX
Explanations
references to adaptations and similarities between stories or characters
Negative Logits
伴              -0.16
afür            -0.16
accompanying    -0.16
巻              -0.15
πον             -0.14
accompany       -0.14
ート            -0.14
oux             -0.14
empo            -0.14
льт             -0.13
Positive Logits
borrow          0.40
borrow          0.39
borrowed        0.36
inspired        0.35
borrowing       0.34
based           0.33
bor             0.32
copied          0.31
Borrow          0.31
inspir          0.30
Activations Density 0.213%