Explanations
phrases and words indicating comparisons or changes over time
Negative Logits
azzi       -0.17
strup      -0.17
erland     -0.15
kara       -0.15
icator     -0.15
.ribbon    -0.14
rian       -0.14
owler      -0.14
.clf       -0.14
ToWorld    -0.14
Positive Logits
amba       0.15
tors       0.15
vale       0.15
visor      0.14
ais        0.14
ãĥĮ        0.14
anners     0.14
åľŃ        0.14
KIT        0.14
za         0.13
Activations Density 0.021%
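
The page does not define the density figure. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading on synthetic data; the function name `activation_density` and the `threshold` parameter are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires. The source page does not state its definition.
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Synthetic example: a feature that fires on 2 of 10,000 tokens.
sample = [0.0] * 10_000
sample[5] = 0.8
sample[1200] = 2.3
print(f"{activation_density(sample):.3%}")  # 0.020%
```

Under this reading, a density of 0.021% corresponds to roughly 2 active tokens per 10,000 sampled.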