INDEX
Explanations
instances of the word "only."
NEGATIVE LOGITS
atleast        -0.17
både           -0.16
Nearly         -0.16
至少           -0.16
几乎           -0.15
actionTypes    -0.15
BOTH           -0.14
Nearly         -0.14
REATED         -0.14
почти          -0.14
POSITIVE LOGITS
partial        0.20
a              0.19
token          0.18
enough         0.18
moderate       0.17
minor          0.17
one            0.17
modest         0.17
limited        0.17
ifiable        0.16
Activations Density 0.083%