INDEX
Explanations
terms related to distortions and misrepresentations
Negative Logits
ept            -0.72
============   -0.70
psc            -0.68
çĦ             -0.66
esome          -0.66
¯¯¯¯           -0.66
sat            -0.65
cript          -0.65
gdala          -0.65
rises          -0.64
Positive Logits
distorted      0.94
distortions    0.93
distort        0.91
distortion     0.91
perceptions    0.84
ibly           0.82
osc            0.76
oidal          0.72
ary            0.69
usional        0.69
Activations Density 0.008%
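
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the page itself does not state how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 8 out of 100,000 tokens.
sample = [0.0] * 99_992 + [1.5] * 8
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.008%
```

Under that reading, a density of 0.008% corresponds to the feature firing on roughly 8 of every 100,000 tokens.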