Explanations
instances of time-related phrases or clauses
Negative Logits
ſelves        -0.92
Cæsar         -0.90
ſelf          -0.89
leſs          -0.89
myſelf        -0.87
themſelves    -0.86
faſt          -0.84
pleaſure      -0.83
houſe         -0.83
Shakspeare    -0.81
Positive Logits
he            0.76
when          0.67
зулта         0.64
the           0.63
Référence     0.63
after         0.61
then          0.60
I             0.59
it            0.58
बाद           0.58
Activations Density 0.199%