Explanations
instances of loss, theft, or deprivation related to personal and societal developments
Negative Logits
unker     -0.17
izzo      -0.16
747       -0.16
737       -0.15
258       -0.15
oso       -0.15
plorer    -0.15
кÑĥл      -0.15
748       -0.14
Äijãi     -0.14
Positive Logits
away      0.48
Away      0.40
snatch    0.38
taken     0.38
away      0.37
Away      0.33
taken     0.33
-away     0.33
Taken     0.32
stolen    0.31
Activations Density: 0.181%