INDEX
Explanations
occurrences of the word "per" and its variations
Negative Logits
abile     -0.17
wright    -0.15
.tt       -0.15
scal      -0.15
egg       -0.15
輪        -0.14
egend     -0.14
iard      -0.14
sworth    -0.14
bling     -0.14
Positive Logits
ohl         0.21
anto        0.18
quisites    0.18
vez         0.15
Tet         0.15
ceptions    0.15
hn          0.15
oxide       0.14
è©          0.14
man         0.14
Activations Density: 0.086%