INDEX
Explanations
occurrences of the word "Still" as a marker of continuity or contrast
Negative Logits
eus             -0.16
adir            -0.15
incident        -0.15
Ngh             -0.15
AuthProvider    -0.14
_AMD            -0.14
hydr            -0.14
è½®             -0.14
zÄħ             -0.14
unidad          -0.13
Positive Logits
çĦ¶             0.16
moth            0.15
ness            0.15
ington          0.14
pagan           0.14
pod             0.14
cha             0.14
od              0.14
777             0.14
argas           0.13
Activations Density 0.011%
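
A few tokens in the logit lists (for example `è½®`, `zÄħ`, and `çĦ¶`) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn such strings back into text. It assumes the underlying tokenizer uses the GPT-2 style byte-to-unicode table, which this page does not state; `byte_to_unicode` and `decode_token` are illustrative helper names, not part of any dashboard API.

```python
def byte_to_unicode():
    """Build the GPT-2 style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Remaining bytes (control characters, space, etc.) are shifted to 256 and above.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


def decode_token(token):
    """Invert the table: vocabulary string -> raw bytes -> UTF-8 text.

    Assumption: the token came from a GPT-2 style byte-level vocabulary.
    """
    inverse = {ch: b for b, ch in byte_to_unicode().items()}
    raw = bytearray()
    for ch in token:
        if ch in inverse:
            raw.append(inverse[ch])
        else:
            # Character outside the table: keep its own UTF-8 bytes.
            raw.extend(ch.encode("utf-8"))
    # Partial multi-byte sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["è½®", "zÄħ", "çĦ¶", "moth"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `è½®` decodes to `轮`, `zÄħ` to `zą`, and `çĦ¶` to `然`, while plain ASCII tokens such as `moth` pass through unchanged. If the model uses a different tokenizer, the mapping may not apply.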