INDEX
Explanations
repeated instances of the word "the" in various contexts
Negative Logits
rich      -0.54
pó        -0.52
N         -0.50
ած        -0.50
あれば    -0.50
And       -0.49
PHS       -0.47
scale     -0.47
нік       -0.47
guera     -0.46
Positive Logits
IsContent        0.88
surla            0.85
Vidite           0.82
Enllaces         0.80
pinulongan       0.80
kuuta            0.77
RectangleBorder  0.76
bkz              0.74
awtextra         0.74
allAfrica        0.74
Activations Density 0.052%