Explanations
distrust of outside sources
Negative Logits
大変  0.69
mantel  0.68
convenient  0.68
oldukça  0.68
وینت  0.68
สะดวก  0.68
confortable  0.68
coaxial  0.67
Torque  0.67
весьма  0.66
Positive Logits
new  0.80
s  0.74
m  0.70
sa  0.67
x  0.66
ri  0.63
sal  0.63
C  0.63
H  0.61
F  0.61
Activations Density 1.917%