INDEX
Explanations
mentions of the word "Rock" within the text
occurrences of the word "Rock" in various contexts
Negative Logits
Token       Logit
URES        -0.75
unctions    -0.73
urers       -0.72
Chandra     -0.69
BILITIES    -0.66
sidx        -0.65
aver        -0.65
abet        -0.64
URE         -0.64
tampering   -0.62
Positive Logits
Token       Logit
ledge       1.11
cliffe      1.04
castle      1.01
ford        1.01
ingham      0.97
well        0.95
estone      0.93
ruff        0.92
fort        0.91
ete         0.90
Activations Density: 0.021%