INDEX
Explanations
phrases that refer to a time or a location
Negative Logits
vert     -0.16
ейн      -0.14
imit     -0.14
opal     -0.14
zd       -0.14
istor    -0.14
aura     -0.14
582      -0.14
üz       -0.14
Eug      -0.14
Positive Logits
scale    0.19
levels   0.18
elic     0.17
every    0.15
grass    0.15
stake    0.15
Scale    0.15
levels   0.15
scales   0.15
every    0.15
Activations Density 0.071%