INDEX
Explanations
occurrences of the word "bis."
Negative Logits
rats     -0.16
instr    -0.15
chair    -0.15
atego    -0.15
433      -0.14
егоÑĢ    -0.14
isl      -0.14
_BLE     -0.14
tail     -0.14
indeb    -0.13
Positive Logits
etz      0.16
till     0.15
razione  0.15
adh      0.15
enz      0.14
Bonjour  0.14
acher    0.14
erman    0.13
etta     0.13
éϵ       0.13
Activations Density: 0.003%