INDEX
Explanations
phrases indicating progression or improvement towards higher goals or levels
Negative Logits
Token       Logit
ontvangst   -0.15
aina        -0.14
iese        -0.14
vál         -0.14
ольно       -0.14
规定        -0.14
ssa         -0.14
utow        -0.13
UpDown      -0.13
eree        -0.13
Positive Logits
Token       Logit
heights     0.44
levels      0.34
Heights     0.29
Levels      0.29
levels      0.29
height      0.28
height      0.27
Height      0.25
Height      0.25
HEIGHT      0.25
Activations Density: 0.066%
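
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and do not come from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not define the term.
    """
    values = list(activations)
    if not values:
        # No positions observed, so report zero rather than dividing by zero.
        return 0.0
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical data: the feature fires on 2 of 3000 token positions.
sample = [0.0] * 2998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.066% would correspond to the feature firing on roughly 66 of every 100,000 token positions.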