Explanations
instances of repeated phrases or contextual references with significance
Negative Logits
anja      -0.16
seau      -0.15
alo       -0.15
acier     -0.14
anto      -0.14
stones    -0.14
Uniform   -0.14
Snow      -0.14
snow      -0.14
çĩ        -0.14
Positive Logits
avage     0.16
exchange  0.16
lie       0.15
ibling    0.15
isku      0.15
ойно      0.15
ga        0.15
Lie       0.15
internal  0.14
Hay       0.14
Activations Density: 0.015%