INDEX
Explanations
references to specific individuals and their interactions in a narrative context
Negative Logits
ottes
-0.15
tí
-0.15
iente
-0.15
uyo
-0.14
İ
-0.14
ощ
-0.14
_observer
-0.13
ihu
-0.13
warts
-0.13
TA
-0.13
Positive Logits
about
0.36
about
0.31
regarding
0.28
关于
0.26
concerning
0.26
_about
0.25
About
0.25
About
0.24
về
0.23
tentang
0.23
Activations Density 0.162%