Explanations
instances of the word "same" in various contexts
Negative Logits
á»Ļng        -0.16
rico        -0.16
ðŁĺī↵↵      -0.15
HING        -0.15
ento        -0.15
rus         -0.15
atu         -0.15
addtogroup  -0.14
лив         -0.14
rito        -0.14
Positive Logits
token       0.21
941         0.18
384         0.17
moment      0.16
time        0.16
-token      0.16
359         0.15
tokens      0.15
="__        0.15
breath      0.15
Activations Density 0.014%