INDEX
Explanations
personal pronouns and references to entities, specifically "it" and "they"
"it" followed by a verb
Negative Logits
']==             -0.35
...              -0.35
+                -0.30
]==              -0.30
Var              -0.29
RegressionTest   -0.29
Buss             -0.29
yn               -0.28
x                -0.28
×                -0.28
POSITIVE LOGITS
ſelf      0.76
ſelves    0.74
ähteet    0.71
majánló   0.70
témoig    0.70
堔        0.70
Majefty   0.68
awarkan   0.66
メンテナ  0.66
Italij    0.66
Activations Density 0.344%