INDEX
Explanations
regular verbs, manual vs automatic
Negative Logits
finding    0.48
drawing    0.46
namespace  0.46
pair       0.43
arm        0.43
grantor    0.43
copy       0.42
raised     0.42
target     0.42
specified  0.41
Positive Logits
מ          0.50
apanicola  0.50
𝗴          0.50
violencia  0.50
ᥣ          0.50
ní         0.49
zejména    0.49
레         0.48
nejen      0.48
ಕ್         0.48
Activations Density 0.002%