Explanations
instances of the Spanish word "de" indicating possession or relationship
Negative Logits
faſt        -0.90
raiſ        -0.79
ſelf        -0.78
myſelf      -0.78
kasarigan   -0.78
ſta         -0.76
itſelf      -0.76
pleaſure    -0.75
ſtand       -0.74
ſever       -0.74
Positive Logits
de    1.14
of    0.96
OF    0.77
De    0.75
di    0.74
ของ   0.73
Of    0.72
של    0.71
of    0.71
De    0.71
Activations Density 0.002%