Explanations
phrases that indicate strain, burden, or impact on systems or individuals
Negative Logits
Token   Logit
otas    -0.17
owing   -0.16
atsu    -0.16
оÑĤа    -0.16
ota     -0.15
nest    -0.15
isti    -0.15
OTA     -0.14
Ľi      -0.14
583     -0.14
Positive Logits
Token    Logit
strain   0.33
dam      0.30
dent     0.26
strains  0.26
Dam      0.26
lid      0.25
brakes   0.23
k        0.23
dam      0.22
brake    0.22
Activations Density: 0.033%