INDEX
Explanations
mentions of locations or places
occurrences of the word "let"
Negative Logits
cumbers   -0.92
resil     -0.80
manship   -0.71
tremend   -0.69
ILY       -0.68
BLE       -0.68
DAY       -0.67
ecause    -0.67
Palestin  -0.66
holiest   -0.66
Positive Logits
tered     1.10
ariat     1.07
ting      1.05
oad       0.95
own       0.93
ters      0.92
icia      0.91
arget     0.89
ocol      0.88
ective    0.86
Activations Density: 0.043%