INDEX
Explanations
pronouns followed by verbs representing actions or modifications of something
the pronoun "it" and variations of the pronoun "them."
Negative Logits
Polk        -0.67
Sund        -0.63
idth        -0.62
Frontier    -0.61
abortions   -0.60
Radar       -0.59
heny        -0.59
Korea       -0.59
ラン        -0.59
United      -0.56
Positive Logits
atic        1.01
self        1.01
accordingly 1.01
alian       0.97
selves      0.94
atically    0.94
iner        0.81
zbollah     0.80
atical      0.79
yourself    0.77
Activations Density 0.130%
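
The page does not define the density figure. A minimal sketch, assuming it is the fraction of token positions on which the feature activates above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 13 active positions out of 10,000 tokens.
sample = [1.0] * 13 + [0.0] * 9987
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.130% would correspond to roughly 13 active positions per 10,000 sampled tokens.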