INDEX
Explanations
terms related to instructions or technical details
words related to measurements and conditions
Negative Logits
Brist       -0.50
Judd        -0.50
Inv         -0.50
Zac         -0.47
Fraz        -0.45
Hug         -0.45
Volunte     -0.44
gew         -0.43
ĵĺ          -0.43
————————    -0.43
Positive Logits
utenberg    0.50
livious     0.49
endif       0.48
derog       0.47
osuke       0.46
yang        0.45
zon         0.45
lings       0.44
ongyang     0.44
ples        0.44
Activations Density: 1.256%