INDEX
Explanations
auxiliary verbs indicating possibility
words indicating existence or action in passive voice
capabilities
Negative Logits
found      -0.77
pleaſure   -0.77
Found      -0.73
found      -0.73
find       -0.71
Saw        -0.69
usehen     -0.66
find       -0.66
ſtate      -0.65
ⓧ          -0.65
Positive Logits
was        0.90
được       0.88
被         0.88
werd       0.81
也被       0.81
wurde      0.77
been       0.77
又被       0.76
wurden     0.72
být        0.72
Activations Density 1.935%