INDEX
Explanations
words and phrases that indicate large quantities or significant impact
Negative Logits
ogle        -0.16
squeeze     -0.15
lesh        -0.15
gle         -0.14
ìĬ¤íĦ°      -0.14
.Extension  -0.14
occasional  -0.14
ustil       -0.14
ende        -0.13
бом         -0.13
Positive Logits
multiple     0.25
repeat       0.24
Repeat       0.23
multiple     0.23
repeat       0.23
repeated     0.22
repetition   0.22
Repeat       0.21
Multiple     0.21
repeats      0.21
Activations Density 0.020%