INDEX
Explanations
occurrences of the word "the"
Negative Logits
Token            Logit
voordeel         -0.58
Phry             -0.58
Esau             -0.55
Assyrian         -0.55
writeFieldEnd    -0.55
Huguen           -0.54
Athenians        -0.53
Fascism          -0.53
quero            -0.52
luffy            -0.52
Positive Logits
Token            Logit
the              1.18
same             1.08
")));            1.07
".               0.98
)];              0.97
">               0.97
"]               0.96
"):              0.94
"]="             0.93
"])              0.93
Activations Density 0.509%