Explanations
phrases indicating possession or ownership
Negative Logits
Token        Logit
ness         -0.38
ly           -0.37
используя    -0.36
izarse       -0.34
izing        -0.32
dom          -0.32
being        -0.32
ising        -0.31
setminus     -0.30
żesz         -0.30
Positive Logits
Token        Logit
been         1.16
gotten       0.91
access       0.88
reached      0.83
received     0.81
begun        0.79
difficulty   0.79
arrived      0.79
arrived      0.78
been         0.78
Activations Density 0.458%