INDEX
Explanations
the word "cat" within text
occurrences of the word "cat," particularly with different variations such as capitalized or combined with other words
occurrences of the word "cat."
Negative Logits
Token       Logit
Vander      -0.73
DPR         -0.67
Gree        -0.67
mble        -0.67
Fargo       -0.65
steroids    -0.64
Vaugh       -0.63
Matter      -0.62
Cere        -0.61
uden        -0.60
Positive Logits
Token       Logit
cat         1.35
aclysm      1.27
alogue      1.25
alog        1.24
alyst       1.22
apult       1.17
hedral      1.10
Cat         1.06
cats        1.05
heter       1.00
Activations Density: 0.008%