INDEX
Explanations
phrases related to instructions or technical details
references to electrical appliances and their safety instructions
Negative Logits
ĵĺ            -0.56
Helic         -0.55
fol           -0.54
NetMessage    -0.52
gew           -0.52
Faul          -0.50
bris          -0.50
Brist         -0.49
Prem          -0.49
grab          -0.49
POSITIVE LOGITS
endif         0.64
TextColor     0.59
orgetown      0.57
hover         0.55
miah          0.54
celona        0.54
ovi           0.53
Logged        0.51
pton          0.51
oser          0.51
Activations Density: 0.866%