Explanations
label followed by colon and value
Negative Logits
asunto     0.44
Invite     0.39
ﻳ          0.39
endroits   0.38
Required   0.38
assuntos   0.38
Depend     0.37
prompt     0.37
Become     0.37
Places     0.36
Positive Logits
удобно     0.47
ów         0.43
había      0.40
渑         0.40
$.}        0.40
דם         0.40
ῆς         0.38
-}\        0.38
ार्ड        0.38
ël         0.38
Activations Density: 0.004%