Explanations
repeated phrases or transitions in narrative descriptions
Negative Logits
Token     Value
wheel     -0.15
adium     -0.15
UNCT      -0.15
очно      -0.14
agon      -0.14
ervo      -0.14
behalf    -0.14
gebra     -0.14
μά        -0.14
ingo      -0.13
Positive Logits
Token     Value
bred      0.25
ogh       0.19
786       0.18
-out      0.18
suốt      0.18
puts      0.18
s         0.18
aus       0.17
ought     0.16
enger     0.16
Activation Density: 0.113%