INDEX
Explanations
phrases indicating possession or ownership
Negative Logits
nger      -0.17
inz       -0.17
emento    -0.15
elsius    -0.15
NB        -0.15
915       -0.15
idos      -0.15
редит     -0.15
gonna     -0.14
spo       -0.14
Positive Logits
fun            0.19
difficulty     0.16
isay           0.16
conversations  0.15
conversation   0.15
eger           0.14
OwnProperty    0.14
blast          0.14
صر             0.14
Diff           0.14
Activations Density: 0.278%