INDEX
Explanations
phrases related to possession or ownership
the presence of specific frequently mentioned articles and demonstratives in context
Negative Logits
"""         -0.66
Bridges     -0.64
Walls       -0.64
Examples    -0.60
Rounds      -0.59
ende        -0.59
ÄŁ          -0.59
[]          -0.57
Wade        -0.56
Sik         -0.56
Positive Logits
same        0.90
utmost      0.86
dreaded     0.84
oneliness   0.76
elusive     0.72
elin        0.72
ocratic     0.71
odic        0.70
usual       0.69
proverbial  0.68
Activations Density: 0.493%