INDEX
Explanations
verbs that indicate action or relationship in the context of processes or methodologies
Negative Logits
tal         -0.77
R           -0.67
the         -0.64
p           -0.63
E           -0.63
r           -0.62
t           -0.62
ton         -0.62
hu          -0.62
be          -0.61
Positive Logits
itſelf      1.32
]")]        1.06
ſelves      1.05
Jefus       1.03
ercises     1.03
ſelf        1.01
myſelf      1.01
كومونز      1.00
doubtnut    1.00
')")        1.00
Activations Density 0.475%
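
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires at all. This definition is not given in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 8 token positions.
sample = [0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.3, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.475% would mean the feature fires on roughly 1 in 210 sampled tokens.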