INDEX
Explanations
frequent occurrences of common pronouns and connecting words in text
Negative Logits
jo        -0.14
/navbar   -0.14
-peer     -0.14
Maze      -0.13
MMdd      -0.13
jos       -0.13
peer      -0.13
iors      -0.13
lake      -0.13
ưng       -0.13
Positive Logits
alion     0.17
atürk     0.16
ascus     0.15
Perkins   0.15
须        0.15
oppers    0.14
Ỽ         0.14
odic      0.14
chner     0.14
洲        0.14
Activations Density 0.002%