INDEX
Explanations
instances of the word "look" and its variations in different forms
Negative Logits
Token      Logit
92         -0.15
94         -0.14
BITS       -0.14
dane       -0.14
ément      -0.14
ardy       -0.14
выгляд     -0.14
iom        -0.14
93         -0.13
avig       -0.13
Positive Logits
Token      Logit
closely    0.34
at         0.29
closer     0.27
Clo        0.24
Clo        0.23
carefully  0.22
clos       0.22
deeper     0.20
CLOSE      0.20
clo        0.20
Activations Density 0.040%
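
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of tokens on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. This definition is not confirmed by the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, of which 4 activate the feature.
acts = [0.0] * 10_000
for i in (17, 2048, 5000, 9999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.040%
```

Under this reading, a density of 0.040% corresponds to the feature firing on roughly 4 out of every 10,000 tokens.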