INDEX
Explanations
instances of the word "Text" and its variations
Negative Logits
letter      -0.15
931         -0.15
éģ          -0.14
ispecies    -0.14
taire       -0.14
먹          -0.14
Letter      -0.14
View        -0.13
rats        -0.13
flip        -0.13
Positive Logits
ured        0.28
ual         0.28
ural        0.23
URED        0.23
UAL         0.22
uring       0.21
ually       0.21
ura         0.20
ures        0.20
uality      0.19
Activations Density: 0.026%