Explanations
Instances of the word "resist" and its variations, indicating a theme of opposition or defiance.
Negative Logits
hem      -0.16
hop      -0.15
kip      -0.15
edy      -0.14
sworth   -0.14
есто     -0.13
heten    -0.13
кав      -0.13
isses    -0.13
ahoma    -0.13
Positive Logits
opyright    0.18
eenth       0.16
ior         0.15
LER         0.15
/mit        0.14
uttgart     0.14
Resistance  0.14
against     0.14
der         0.14
Stud        0.14
Activations Density 0.025%