INDEX
Explanations
references to the Sabbath
Negative Logits
IZE         -0.75
Tigers      -0.70
stal        -0.62
most        -0.62
CONTROL     -0.60
lda         -0.58
piece       -0.58
mobilize    -0.58
appreciate  -0.57
isolate     -0.57
Positive Logits
abb         1.52
ucket       1.02
itt         0.96
azz         0.95
abis        0.92
arella      0.91
agn         0.88
arre        0.86
alist       0.86
oyle        0.86
Activations Density 0.004%