INDEX
Explanations
instances of the word "rain."
repeated occurrences of the substring 'ra'
Negative Logits
Token         Logit
wang          -0.78
creen         -0.74
Siem          -0.74
Phillip       -0.71
LG            -0.71
Younger       -0.69
Ful           -0.69
employment    -0.68
elect         -0.67
pair          -0.66
Positive Logits
Token         Logit
ra            3.71
Raider        2.04
Ra            1.66
gra           1.19
ra            1.19
rax           1.06
RA            1.06
raiding       0.98
RA            0.95
raising       0.90
Activations Density: 0.025%