INDEX
Explanations
instances of the word "Bol" with varying strengths of activation
references to Bolivia and related terms
Negative Logits
natureconservancy   -0.76
dfx                 -0.71
PLIED               -0.66
Alien               -0.64
apers               -0.63
abyte               -0.62
EY                  -0.62
Birch               -0.61
EDT                 -0.61
Bir                 -0.61
Positive Logits
ivia        0.91
zar         0.83
ign         0.82
acas        0.79
uca         0.79
idays       0.78
ength       0.78
oning       0.76
atility     0.75
olic        0.75
Activations Density: 0.025%