INDEX
Explanations
the word "were" and its various occurrences
past participles following were
Negative Logits
<bos>       -0.58
Arhi        -0.45
ting        -0.44
pyplot      -0.44
onSubmit    -0.43
csin        -0.43
tan         -0.42
inet        -0.41
(blank)     -0.41
Lait        -0.41
Positive Logits
were        1.42
were        1.28
Were        1.16
Were        1.11
были        1.11
WERE        1.11
były        1.05
byly        0.99
були        0.98
estavam     0.94
Activations Density: 0.149%
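
The density figure is not defined on the page. Assuming it denotes the fraction of sampled token positions on which the feature activates above zero, it could be computed roughly as in the sketch below. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to illustrate the arithmetic; they do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 149 active positions out of 100,000 tokens.
sample = [0.0] * 99_851 + [1.0] * 149
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.149% would mean the feature fires on roughly 1 in every 670 tokens, which fits a feature tied to a single word such as "were".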